Alcohol dehydrogenases (ADH) useful for fermentative production of lower alkyl alcohols

ABSTRACT

The invention relates to suitable candidate ADH enzymes for production of lower alkyl alcohols including isobutanol. The invention also relates to recombinant host cells that comprise such ADH enzymes and methods for producing lower alkyl alcohols in the same.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates to the fields of industrial microbiology and alcohol production. Specifically, the invention relates to suitable alcohol dehydrogenases for the production of lower alkyl alcohols via an engineered pathway in microorganisms. More specifically, the invention relates to suitable alcohol dehydrogenases for the production of butanol, particularly isobutanol, via an engineered pathway in microorganisms.

2. Background Art

Butanol is an important industrial chemical, useful as a fuel additive, as a feedstock chemical in the plastics industry, and as a food grade extractant in the food and flavor industry. Each year 10 to 12 billion pounds of butanol are produced by petrochemical means and the need for this commodity chemical will likely increase in the future.

Methods for the chemical synthesis of isobutanol are known, such as oxo synthesis, catalytic hydrogenation of carbon monoxide (Ullmann's Encyclopedia of Industrial Chemistry, 6th edition, 2003, Wiley-VCH Verlag GmbH and Co., Weinheim, Germany, Vol. 5, pp. 716-719) and Guerbet condensation of methanol with n-propanol (Carlini et al., J. Molec. Catal. A: Chem. 220:215-220, 2004). These processes use starting materials derived from petrochemicals, are generally expensive, and are not environmentally friendly.

Isobutanol is produced biologically as a by-product of yeast fermentation. It is a component of “fusel oil” that forms as a result of the incomplete metabolism of amino acids by this group of fungi. Isobutanol is specifically produced from catabolism of L-valine. After the amine group of L-valine is harvested as a nitrogen source, the resulting α-keto acid is decarboxylated and reduced to isobutanol by enzymes of the so-called Ehrlich pathway (Dickinson et al., J. Biol. Chem. 273:25752-25756, 1998). Yields of fusel oil and/or its components achieved during beverage fermentation are typically low. For example, the concentration of isobutanol produced in beer fermentation is reported to be less than 16 parts per million (Garcia et al., Process Biochemistry 29:303-309, 1994). Addition of exogenous L-valine to the fermentation mixture increases the yield of isobutanol, as described by Dickinson et al., supra, wherein it is reported that a yield of isobutanol of 3 g/L is obtained by providing L-valine at a concentration of 20 g/L in the fermentation mixture. In addition, production of n-propanol, isobutanol and isoamyl alcohol has been shown by calcium alginate immobilized cells of Zymomonas mobilis. A 10% glucose-containing medium supplemented with either L-Leu, L-Ile, L-Val, α-ketoisocaproic acid (α-KCA), α-ketobutyric acid (α-KBA) or α-ketoisovaleric acid (α-KVA) was used (Oaxaca, et al., Acta Biotechnol. 11:523-532, 1991). α-KCA increased isobutanol levels. The amino acids also gave corresponding alcohols, but to a lesser degree than the keto acids. An increase in the yield of C₃-C₅ alcohols from carbohydrates was shown when amino acids leucine, isoleucine, and/or valine were added to the growth medium as the nitrogen source (PCT Publ. No. WO 2005/040392).

Whereas the methods described above indicate the potential of isobutanol production via biological means, these methods are cost prohibitive for industrial scale isobutanol production.

For an efficient biosynthetic process, an optimal enzyme is required at the last step to rapidly convert isobutyraldehyde to isobutanol. Furthermore, an accumulation of isobutyraldehyde in the production host normally leads to undesirable cellular toxicity.

Alcohol dehydrogenases (ADHs) are a family of proteins comprising a large group of enzymes that catalyze the interconversion of aldehydes and alcohols (de Smidt et al., FEMS Yeast Res., 8:967-978, 2008), with varying specificities for different alcohols and aldehydes. There is a need to identify suitable ADH enzymes to catalyze the formation of product alcohols in recombinant microorganisms. There is also a need to identify a suitable ADH enzyme that would catalyze the formation of isobutanol at a high rate, with specific affinity for isobutyraldehyde as the substrate and in the presence of high levels of isobutanol.

BRIEF SUMMARY OF THE INVENTION

One aspect of the invention is directed to a recombinant microbial host cell comprising a heterologous polynucleotide that encodes a polypeptide wherein the polypeptide has alcohol dehydrogenase activity. In embodiments, the recombinant microbial host cell further comprises a biosynthetic pathway for the production of a lower alkyl alcohol, wherein the biosynthetic pathway comprises a substrate to product conversion catalyzed by a polypeptide with alcohol dehydrogenase activity. In embodiments, the polypeptide has alcohol dehydrogenase activity and one or more of the following characteristics: (a) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; (b) the K_(I) value for a lower alkyl alcohol for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (c) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26. In embodiments, the polypeptide having alcohol dehydrogenase activity has two or more of the above-listed characteristics. In embodiments, the polypeptide preferentially uses NADH as a cofactor. In embodiments, the polypeptide having alcohol dehydrogenase activity has three of the above-listed characteristics. In embodiments, the biosynthetic pathway for production of a lower alkyl alcohol is a butanol, propanol, isopropanol, or ethanol biosynthetic pathway. In one embodiment, the biosynthetic pathway for production of a lower alkyl alcohol is a butanol biosynthetic pathway.

Accordingly, one aspect of the invention is a recombinant microbial host cell comprising: a biosynthetic pathway for production of a lower alkyl alcohol, the biosynthetic pathway comprising a substrate to product conversion catalyzed by a polypeptide with alcohol dehydrogenase activity and one or more, two or more, or all of the following characteristics: (a) the K_(M) value for isobutyraldehyde is lower for said polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; (b) the K_(I) value for isobutanol for said polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (c) the k_(cat)/K_(M) value for isobutyraldehyde for said polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26. In embodiments, the biosynthetic pathway for production of a lower alkyl alcohol is a butanol, propanol, isopropanol, or ethanol biosynthetic pathway. In embodiments, the polypeptide with alcohol dehydrogenase activity has at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 31, 32, 34, 35, 36, 37, or 38. In embodiments, the polypeptide with alcohol dehydrogenase activity has the amino acid sequence of SEQ ID NO: 31. In embodiments, the polypeptide with alcohol dehydrogenase activity is encoded by a polynucleotide having at least 90% identity to a nucleotide sequence of SEQ ID NO: 1, 2, 3, 4, 5, 6, 11, 12, 14, 15, 16, or 17. In embodiments, the polypeptide having alcohol dehydrogenase activity catalyzes the conversion of isobutyraldehyde to isobutanol in the presence of isobutanol at a concentration of at least about 10 g/L, at least about 15 g/L, or at least about 20 g/L.
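By way of non-limiting illustration, the comparison of a candidate polypeptide against the control polypeptide under characteristics (a) through (c) can be sketched in Python as follows. The class name `KineticProfile`, the function name `characteristics_met`, and all numeric values are hypothetical and are not taken from this disclosure; the sketch assumes only that both enzymes were measured in the same units.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KineticProfile:
    """Hypothetical container for measured constants (same units for both enzymes)."""
    km_aldehyde: float    # K_M for isobutyraldehyde
    ki_alcohol: float     # K_I for isobutanol
    kcat_over_km: float   # k_cat/K_M for isobutyraldehyde


def characteristics_met(candidate: KineticProfile, control: KineticProfile) -> List[str]:
    """Return the labels (a), (b), (c) that the candidate satisfies versus the control."""
    met = []
    if candidate.km_aldehyde < control.km_aldehyde:    # (a) lower K_M
        met.append("a")
    if candidate.ki_alcohol > control.ki_alcohol:      # (b) higher K_I
        met.append("b")
    if candidate.kcat_over_km > control.kcat_over_km:  # (c) higher k_cat/K_M
        met.append("c")
    return met


# Illustrative numbers only; they are not measurements from this disclosure.
control = KineticProfile(km_aldehyde=1.0, ki_alcohol=100.0, kcat_over_km=50.0)
candidate = KineticProfile(km_aldehyde=0.4, ki_alcohol=80.0, kcat_over_km=120.0)
print(characteristics_met(candidate, control))
```

In this made-up case the candidate satisfies (a) and (c) but not (b), so it would fall within the "one or more" and "two or more" groups but not the group having all three characteristics.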

In embodiments, the biosynthetic pathway for production of a lower alkyl alcohol is an isobutanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) pyruvate to acetolactate; (b) acetolactate to 2,3-dihydroxyisovalerate; (c) 2,3-dihydroxyisovalerate to α-ketoisovalerate; (d) α-ketoisovalerate to isobutyraldehyde; and (e) isobutyraldehyde to isobutanol; and wherein said microbial host cell produces isobutanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of pyruvate to acetolactate is acetolactate synthase having the EC number 2.2.1.6; (b) the polypeptide that catalyzes a substrate to product conversion of acetolactate to 2,3-dihydroxyisovalerate is acetohydroxy acid isomeroreductase having the EC number 1.1.1.86; (c) the polypeptide that catalyzes a substrate to product conversion of 2,3-dihydroxyisovalerate to alpha-ketoisovalerate is acetohydroxy acid dehydratase having the EC number 4.2.1.9; and (d) the polypeptide that catalyzes a substrate to product conversion of alpha-ketoisovalerate to isobutyraldehyde is branched-chain alpha-keto acid decarboxylase having the EC number 4.1.1.72. In embodiments, the biosynthetic pathway for production of a lower alkyl alcohol is an isobutanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) pyruvate to acetolactate; (b) acetolactate to 2,3-dihydroxyisovalerate; (c) 2,3-dihydroxyisovalerate to α-ketoisovalerate; (d) α-ketoisovalerate to isobutyryl-CoA; (e) isobutyryl-CoA to isobutyraldehyde; and (f) isobutyraldehyde to isobutanol; and wherein said microbial host cell produces isobutanol.
In embodiments, the biosynthetic pathway for production of a lower alkyl alcohol is an isobutanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) pyruvate to acetolactate; (b) acetolactate to 2,3-dihydroxyisovalerate; (c) 2,3-dihydroxyisovalerate to α-ketoisovalerate; (d) α-ketoisovalerate to valine; (e) valine to isobutylamine; (f) isobutylamine to isobutyraldehyde; and (g) isobutyraldehyde to isobutanol; and wherein said microbial host cell produces isobutanol.

Also provided herein are recombinant microbial host cells comprising a biosynthetic pathway for the production of a lower alkyl alcohol and a heterologous polynucleotide encoding a polypeptide with alcohol dehydrogenase activity having at least 85% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 31, 32, 34, 35, 36, 37, or 38. In embodiments, the biosynthetic pathway for the production of a lower alkyl alcohol is a 2-butanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) pyruvate to alpha-acetolactate; (b) alpha-acetolactate to acetoin; (c) acetoin to 2,3-butanediol; (d) 2,3-butanediol to 2-butanone; and (e) 2-butanone to 2-butanol; and wherein said microbial host cell produces 2-butanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of pyruvate to acetolactate is acetolactate synthase having the EC number 2.2.1.6; (b) the polypeptide that catalyzes a substrate to product conversion of acetolactate to acetoin is acetolactate decarboxylase having the EC number 4.1.1.5; (c) the polypeptide that catalyzes a substrate to product conversion of acetoin to 2,3-butanediol is butanediol dehydrogenase having the EC number 1.1.1.76 or EC number 1.1.1.4; (d) the polypeptide that catalyzes a substrate to product conversion of butanediol to 2-butanone is butanediol dehydratase having the EC number 4.2.1.28; and (e) the polypeptide that catalyzes a substrate to product conversion of 2-butanone to 2-butanol is 2-butanol dehydrogenase having the EC number 1.1.1.1. In embodiments, the polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38.
In embodiments, the polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 31.

In embodiments, the biosynthetic pathway for the production of a lower alkyl alcohol is a 1-butanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) acetyl-CoA to acetoacetyl-CoA; (b) acetoacetyl-CoA to 3-hydroxybutyryl-CoA; (c) 3-hydroxybutyryl-CoA to crotonyl-CoA; (d) crotonyl-CoA to butyryl-CoA; (e) butyryl-CoA to butyraldehyde; and (f) butyraldehyde to 1-butanol; and wherein said microbial host cell produces 1-butanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of acetyl-CoA to acetoacetyl-CoA is acetyl-CoA acetyltransferase having the EC number 2.3.1.9 or 2.3.1.16; (b) the polypeptide that catalyzes a substrate to product conversion of acetoacetyl-CoA to 3-hydroxybutyryl-CoA is 3-hydroxybutyryl-CoA dehydrogenase having the EC number 1.1.1.35, 1.1.1.30, 1.1.1.157, or 1.1.1.36; (c) the polypeptide that catalyzes a substrate to product conversion of 3-hydroxybutyryl-CoA to crotonyl-CoA is crotonase having the EC number 4.2.1.17 or 4.2.1.55; (d) the polypeptide that catalyzes a substrate to product conversion of crotonyl-CoA to butyryl-CoA is butyryl-CoA dehydrogenase having the EC number 1.3.1.44 or 1.3.1.38; (e) the polypeptide that catalyzes a substrate to product conversion of butyryl-CoA to butyraldehyde is butyraldehyde dehydrogenase having the EC number 1.2.1.57; and (f) the polypeptide that catalyzes a substrate to product conversion of butyraldehyde to 1-butanol is 1-butanol dehydrogenase. In embodiments, the polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38. In embodiments, the polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 31.

In embodiments, the recombinant microbial host cell is selected from the group consisting of: bacteria, cyanobacteria, filamentous fungi and yeasts. In embodiments, the host cell is a bacterial or cyanobacterial cell. In embodiments, the genus of the host cells is selected from the group consisting of: Salmonella, Arthrobacter, Bacillus, Brevibacterium, Clostridium, Corynebacterium, Gluconobacter, Nocardia, Pseudomonas, Rhodococcus, Streptomyces, Zymomonas, Escherichia, Lactobacillus, Enterococcus, Alcaligenes, Klebsiella, Serratia, Shigella, Erwinia, Paenibacillus, and Xanthomonas. In embodiments, the genus of the host cells provided herein is selected from the group consisting of: Saccharomyces, Pichia, Hansenula, Yarrowia, Aspergillus, Kluyveromyces, Pachysolen, Rhodotorula, Zygosaccharomyces, Galactomyces, Schizosaccharomyces, Torulaspora, Debaryomyces, Williopsis, Dekkera, Kloeckera, Metschnikowia, Issatchenkia, and Candida.

Another aspect of the present invention is a method for producing isobutanol comprising: (a) providing a recombinant microbial host cell comprising an isobutanol biosynthetic pathway, the pathway comprising a heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol wherein the polypeptide has at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby isobutanol is produced. In embodiments, the heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol has at least 90% identity to the amino acid sequence of SEQ ID NO: 31. Another aspect is a method for producing 2-butanol comprising: (a) providing a recombinant microbial host cell comprising a 2-butanol biosynthetic pathway, the pathway comprising a heterologous polypeptide having at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 2-butanol is produced. In embodiments, the heterologous polypeptide has at least 90% identity to the amino acid sequence of SEQ ID NO: 31. Another aspect is a method for producing 1-butanol comprising: (a) providing a recombinant microbial host cell comprising a 1-butanol biosynthetic pathway, the pathway comprising a heterologous polypeptide having at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 1-butanol is produced. In embodiments, the heterologous polypeptide has at least 90% identity to the amino acid sequence of SEQ ID NO: 31.

Also provided herein are methods for the production of a lower alkyl alcohol comprising: (a) providing a recombinant host cell provided herein; (b) contacting said host cell with a fermentable carbon substrate in a fermentation medium under conditions whereby the lower alkyl alcohol is produced; and (c) recovering said lower alkyl alcohol. In embodiments, said fermentable carbon substrate is selected from the group consisting of: monosaccharides, oligosaccharides, and polysaccharides. In embodiments, said monosaccharide is selected from the group consisting of: glucose, galactose, mannose, rhamnose, xylose, and fructose. In embodiments, said oligosaccharide is selected from the group consisting of: sucrose, maltose, and lactose. In embodiments, said polysaccharide is selected from the group consisting of: starch, cellulose, and maltodextrin. In embodiments, the conditions are anaerobic, aerobic, or microaerobic. In embodiments, said lower alkyl alcohol is produced at a titer of at least about 10 g/L, at least about 15 g/L, or at least about 20 g/L. In embodiments, said lower alkyl alcohol is selected from the group consisting of: butanol, isobutanol, propanol, isopropanol, and ethanol.

In embodiments, isobutanol is produced. In embodiments, the method for producing isobutanol comprises: (a) providing a recombinant host cell comprising a heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol and which has one or more of the following characteristics: (i) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; (ii) the K_(I) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby isobutanol is produced.

In embodiments, 1-butanol is produced. In embodiments, the method for producing 1-butanol comprises: (a) providing a recombinant microbial host cell comprising a heterologous polypeptide which catalyzes the substrate to product conversion of butyraldehyde to 1-butanol and which has one or more of the following characteristics: (i) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; (ii) the K_(I) value for a lower alkyl alcohol for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 1-butanol is produced.

Also provided herein are methods for screening candidate polypeptides having alcohol dehydrogenase activity, said method comprising: (a) providing a candidate polypeptide and a cofactor selected from the group consisting of NADH and NADPH; (b) monitoring a change in A_(340 nm) over time in the presence or absence of a lower alkyl alcohol for the candidate polypeptide; and (c) selecting those candidate polypeptides where the change in A_(340 nm) is a decrease, and the decrease is faster in the absence of the lower alkyl alcohol with respect to the decrease in the presence of the lower alkyl alcohol. In embodiments, the methods further comprise (d) providing a control polypeptide having the amino acid sequence of either SEQ ID NO: 21 or 26 and NADH; (e) monitoring a change in A_(340 nm) over time in the presence or absence of a lower alkyl alcohol for the control polypeptide; (f) comparing the changes observed in (e) with the changes observed in (b); and (g) selecting those candidate polypeptides where the decrease in A_(340 nm) in the absence of the lower alkyl alcohol is faster than the decrease observed for the control polypeptide. In embodiments, the methods further comprise (d) providing a control polypeptide having the amino acid sequence of either SEQ ID NO: 21 or 26 and NADH; (e) monitoring a change in A_(340 nm) over time in the presence or absence of a lower alkyl alcohol for the control polypeptide; (f) comparing the changes observed in (e) with the changes observed in (b); and (g) selecting those candidate polypeptides where the decrease in A_(340 nm) in the presence of the lower alkyl alcohol is faster than the decrease observed for the control polypeptide.
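By way of non-limiting illustration, the selection logic of the screening steps can be sketched in Python as follows. The sketch assumes that each time course is summarized by a single least-squares slope of A_(340 nm) against time, which is one possible way to compare how fast the absorbance decreases and is not prescribed by this disclosure. The function names `a340_slope`, `passes_screen`, and `beats_control`, and the readings in the usage lines, are hypothetical.

```python
from typing import Sequence


def a340_slope(times: Sequence[float], a340: Sequence[float]) -> float:
    """Least-squares slope of A340 versus time; negative means NAD(P)H is consumed."""
    n = len(times)
    if n != len(a340) or n < 2:
        raise ValueError("need two or more paired readings")
    t_mean = sum(times) / n
    a_mean = sum(a340) / n
    num = sum((t - t_mean) * (a - a_mean) for t, a in zip(times, a340))
    den = sum((t - t_mean) ** 2 for t in times)
    if den == 0:
        raise ValueError("time points must not all be identical")
    return num / den


def passes_screen(slope_without_alcohol: float, slope_with_alcohol: float) -> bool:
    """Step (c): A340 decreases, and it falls faster without the alcohol than with it."""
    return slope_without_alcohol < 0 and slope_without_alcohol < slope_with_alcohol


def beats_control(candidate_slope: float, control_slope: float) -> bool:
    """Step (g): the candidate's A340 falls faster than the control's, same condition."""
    return candidate_slope < control_slope


# Illustrative readings only (time in seconds, absorbance units).
times = [0, 10, 20, 30]
without_alcohol = a340_slope(times, [1.0, 0.8, 0.6, 0.4])
with_alcohol = a340_slope(times, [1.0, 0.9, 0.8, 0.7])
print(passes_screen(without_alcohol, with_alcohol))
```

A more negative slope corresponds to a faster decrease, so both comparisons reduce to "less than" tests on slopes measured under matching conditions.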

Also provided herein is use of an alcohol dehydrogenase having at least about 80% identity to an amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 in a microbial host cell to catalyze the conversion of isobutyraldehyde to isobutanol; wherein said host cell comprises an isobutanol biosynthetic pathway.

BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES AND SEQUENCES

FIG. 1 shows the results of semi-physiological time-course assays showing isobutyraldehyde reduction by NAD(P)H, catalyzed by ADH candidate enzymes in the presence and absence of isobutanol. Enzymatic activity is measured by following changes in absorbance at 340 nm. In each panel, A_(340 nm) of NADH or NADPH alone, in the presence of all other reactants except the enzyme, was used as a control. Panel A shows the change in absorbance at 340 nm over time for Achromobacter xylosoxidans SadB. Panel B shows the change in absorbance at 340 nm over time for horse liver ADH. Panel C shows the change in absorbance at 340 nm over time for Saccharomyces cerevisiae ADH6. Panel D shows the change in absorbance at 340 nm over time for Saccharomyces cerevisiae ADH7. Panel E shows the change in absorbance at 340 nm over time for Beijerinckia indica ADH. Panel F shows the change in absorbance at 340 nm over time for Clostridium beijerinckii ADH. Panel G shows the change in absorbance at 340 nm over time for Rattus norvegicus ADH. Panel H shows the change in absorbance at 340 nm over time for Therm. sp. ATN1 ADH.

FIG. 2 shows the results of semi-physiological time-course assays comparing the level of isobutanol inhibition observed with horse liver ADH and Achromobacter xylosoxidans SadB in the same figure. The assays are as described for FIG. 1.

FIG. 3 is an alignment of the polypeptide sequences of Pseudomonas putida formaldehyde dehydrogenase (1kolA) (SEQ ID NO: 79), horse liver ADH (2ohxA) (SEQ ID NO: 21), Clostridium beijerinckii ADH (1pedA) (SEQ ID NO: 29), Pyrococcus horikoshii L-threonine 3-dehydrogenase (2d8aA) (SEQ ID NO: 80), and Achromobacter xylosoxidans SadB (SEQ ID NO: 26).

FIG. 4 is a phylogenetic tree of oxidoreductase enzymes obtained as hits from (i) a protein BLAST search for similar sequences in Saccharomyces cerevisiae, E. coli, Homo sapiens, C. elegans, Drosophila melanogaster, and Arabidopsis thaliana, and (ii) a protein BLAST search of Protein Data Bank (PDB) for similar sequences using horse liver ADH and Achromobacter xylosoxidans SadB as queries.

FIG. 5 is a phylogenetic tree of oxidoreductase enzyme sequences more closely related in sequence to Achromobacter xylosoxidans SadB among hits from a protein BLAST search of the nonredundant protein sequence database (nr) at NCBI using Achromobacter xylosoxidans SadB as query.

FIG. 6 is an illustration of example pyruvate to isobutanol biosynthetic pathways.

FIG. 7 shows the Michaelis-Menten plots describing the properties of the enzymes pertaining to isobutyraldehyde reduction. FIG. 7A shows results of assays to determine the K_(I) for isobutanol for ADH6 and FIG. 7B shows results of assays to determine the K_(I) for isobutanol for BiADH.

FIG. 8A shows the results of semi-physiological time-course assays, which were as described for FIG. 1. Panel A shows the change in absorbance at 340 nm over time for the ADH from Phenylobacterium zucineum. Panel B shows the change in absorbance at 340 nm over time for Methylocella silvestris BL2. Panel C shows the change in absorbance at 340 nm over time for Acinetobacter baumannii AYE.

FIG. 9 depicts the pdc1::ilvD::FBA-alsS::trx1 A locus. The alsS gene integration in the pdc1-trx1 intergenic region is considered a “scarless” insertion since vector, marker gene and loxP sequences are lost.

The following sequences provided in the accompanying sequence listing, filed electronically herewith and incorporated herein by reference, conform with 37 C.F.R. 1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures—the Sequence Rules”) and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (2009) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

SEQ ID NOs: 1 and 7-20 are codon-optimized polynucleotide sequences.

SEQ ID NOs: 2 and 3 are polynucleotide sequences from Saccharomyces cerevisiae.

SEQ ID NOs: 4 and 5 are polynucleotide sequences from Clostridium acetobutylicum.

SEQ ID NO: 6 is a polynucleotide sequence from Achromobacter xylosoxidans.

SEQ ID NOs: 21-40 and 79-80 are polypeptide sequences.

SEQ ID NOs: 41-50 and 52-57 and 59-74 and 77-78 are primers.

SEQ ID NO: 51 is the sequence of the pRS423::TEF(M4)-xpk1+ENO1-eutD plasmid.

SEQ ID NO: 58 is the sequence of the pUC19-URA3::pdc1::TEF(M4)-xpk1::kan plasmid.

SEQ ID NO: 75 is the sequence of the pLH468 plasmid.

SEQ ID NO: 76 is the BiADH coding region (codon optimized for yeast) plus 5′ homology to GPM promoter and 3′ homology to ADH1 terminator.

SEQ ID NO: 81 is the sequence of the pRS426::GPD-xpk1+ADH-eutD plasmid.

DETAILED DESCRIPTION OF THE INVENTION

The stated problems are solved as described herein by devising and using a suitable screening strategy for evaluating various candidate ADH enzymes. The screening strategy can be used to identify ADH enzymes having desirable characteristics. These identified ADH enzymes can be used to enhance the biological production of lower alkyl alcohols, such as isobutanol. Also provided are recombinant host cells that express the identified desirable ADH enzymes and methods for producing lower alkyl alcohols using the same.

The present invention describes a method for screening large numbers of alcohol dehydrogenase (ADH) enzymes for their ability to rapidly convert isobutyraldehyde to isobutanol in the presence of high concentrations of isobutanol. Also described in the present invention is a new ADH that is present in the bacterium Beijerinckia indica subspecies indica ATCC 9039. The Beijerinckia indica ADH enzyme can be used in the production of isobutanol from isobutyraldehyde in a recombinant microorganism having an isobutyraldehyde source.

The present invention meets a number of commercial and industrial needs. Butanol is an important industrial commodity chemical with a variety of applications, where its potential as a fuel or fuel additive is particularly significant. Although only a four-carbon alcohol, butanol has an energy content similar to that of gasoline and can be blended with any fossil fuel. Butanol is favored as a fuel or fuel additive as it yields only CO₂ and little or no SO₂ or NO₂ when burned in the standard internal combustion engine. Additionally, butanol is less corrosive than ethanol, the most preferred fuel additive to date.

In addition to its utility as a biofuel or fuel additive, butanol has the potential of impacting hydrogen distribution problems in the emerging fuel cell industry. Fuel cells today are plagued by safety concerns associated with hydrogen transport and distribution. Butanol can be easily reformed for its hydrogen content and can be distributed through existing gas stations in the purity required for either fuel cells or vehicles.

The present invention produces butanol from plant derived carbon sources, avoiding the negative environmental impact associated with standard petrochemical processes for butanol production. In one embodiment, the present invention provides a method for the selection and identification of ADH enzymes that increase the flux in the last reaction of the isobutanol biosynthesis pathway: the conversion of isobutyraldehyde to isobutanol. In one embodiment, the present invention provides a method for the selection and identification of ADH enzymes that increase the flux in the last reaction of the 1-butanol biosynthesis pathway: the conversion of butyraldehyde to 1-butanol. In one embodiment, the present invention provides a method for the selection and identification of ADH enzymes that increase the flux in the last reaction of the 2-butanol biosynthesis pathway: the conversion of 2-butanone to 2-butanol. Particularly useful ADH enzymes are those that are better able to increase the flux in the isobutyraldehyde to isobutanol conversion reaction when compared to known control ADH enzymes. The present invention also provides for recombinant host cells expressing such identified ADH enzymes and methods for using the same.

The following definitions and abbreviations are to be used for the interpretation of the claims and the specification.

The term “invention” or “present invention” as used herein is meant to apply generally to all embodiments of the invention as described in the claims as presented or as later amended and supplemented, or in the specification.

The term “isobutanol biosynthetic pathway” refers to the enzymatic pathway to produce isobutanol from pyruvate.

The term “1-butanol biosynthetic pathway” refers to the enzymatic pathway to produce 1-butanol from pyruvate.

The term “2-butanol biosynthetic pathway” refers to the enzymatic pathway to produce 2-butanol from acetyl-CoA.

The term “NADH consumption assay” refers to an enzyme assay for the determination of the specific activity of the alcohol dehydrogenase enzyme, which is measured as a stoichiometric disappearance of NADH, a cofactor for the enzyme reaction, as described in Racker, J. Biol. Chem., 184:313-319 (1950).
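By way of non-limiting illustration, the conversion of an observed A_(340 nm) decrease into a specific activity can be sketched in Python as follows. The sketch assumes the Beer-Lambert law with the commonly used extinction coefficient for NADH at 340 nm of 6.22 mM⁻¹ cm⁻¹ and one NADH consumed per aldehyde reduced; the function name `specific_activity`, its parameters, and the example values are hypothetical and are not taken from this disclosure or from the cited reference.

```python
NADH_EXTINCTION_PER_MM_CM = 6.22  # assumed: commonly used value for NADH at 340 nm


def specific_activity(delta_a340_per_min: float,
                      assay_volume_ml: float,
                      enzyme_mg: float,
                      path_length_cm: float = 1.0) -> float:
    """Return specific activity in micromol NADH consumed per minute per mg enzyme."""
    if enzyme_mg <= 0 or path_length_cm <= 0 or assay_volume_ml <= 0:
        raise ValueError("enzyme amount, path length and volume must be positive")
    # Beer-Lambert: concentration change (mM/min) = absorbance change / (epsilon * l).
    nadh_mm_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION_PER_MM_CM * path_length_cm)
    # 1 mM equals 1 micromol per mL, so multiplying by mL gives micromol/min.
    umol_per_min = nadh_mm_per_min * assay_volume_ml
    return umol_per_min / enzyme_mg


# Illustrative values only: A340 falls by 0.622 per minute in a 1 mL assay with 0.01 mg enzyme.
print(specific_activity(-0.622, assay_volume_ml=1.0, enzyme_mg=0.01))
```

The absolute value is taken because the assay records a decrease in absorbance, while activity is conventionally reported as a positive rate.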

“ADH” is the abbreviation for the enzyme alcohol dehydrogenase.

The terms “isobutyraldehyde dehydrogenase,” “secondary alcohol dehydrogenase,” “butanol dehydrogenase,” “branched-chain alcohol dehydrogenase,” and “alcohol dehydrogenase” will be used interchangeably and refer to the enzyme having the EC number, EC 1.1.1.1 (Enzyme Nomenclature 1992, Academic Press, San Diego). Preferred branched-chain alcohol dehydrogenases are known by the EC number 1.1.1.265, but may also be classified under other alcohol dehydrogenases (specifically, EC 1.1.1.1 or 1.1.1.2). These enzymes utilize NADH (reduced nicotinamide adenine dinucleotide) and/or NADPH as an electron donor.

As used herein, “heterologous” refers to a polynucleotide, gene orpolypeptide not normally found in the host organism but that isintroduced or is otherwise modified. “Heterologous polynucleotide”includes a native coding region from the host organism, or portionthereof, that is reintroduced or otherwise modified in the host organismin a form that is different from the corresponding native polynucleotideas well as a coding region from a different organism, or portion thereof“Heterologous gene” includes a native coding region, or portion thereof,that is reintroduced or is otherwise modified from the source organismin a form that is different from the corresponding native gene as wellas a coding region from a different organism. For example, aheterologous gene may include a native coding region that is a portionof a chimeric gene including non-native regulatory regions that isreintroduced into the native host. “Heterologous polypeptide” includes anative polypeptide that is reintroduced or otherwise modified in thehost organism in a form that is different from the corresponding nativepolypeptide as well as a polypeptide from another organism.

The term “carbon substrate” or “fermentable carbon substrate” refers toa carbon source capable of being metabolized by host organisms of thepresent invention. Non-limited examples of carbon sources that can beused in the invention include monosaccharides, oligosaccharides,polysaccharides, and one-carbon substrates or mixtures thereof.

The terms “k_(cat),” “K_(M),” and “K_(I)” are known to those skilled in the art and are described in Enzyme Structure and Mechanism, 2nd ed. (Fersht, W.H. Freeman: NY, 1985; pp 98-120). The term “k_(cat),” often called the “turnover number,” is defined as the maximum number of substrate molecules converted to product molecules per active site per unit time, or the number of times the enzyme turns over per unit time. k_(cat)=V_(max)/[E], where [E] is the enzyme concentration (Fersht, supra).

The term “catalytic efficiency” is defined as the k_(cat)/K_(M) of an enzyme. “Catalytic efficiency” is used to quantitate the specificity of an enzyme for a substrate.
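As an illustration of the two definitions above, the following minimal Python sketch computes k_(cat) as V_(max)/[E] and catalytic efficiency as k_(cat)/K_(M). The function names and the numerical inputs are hypothetical and chosen only for illustration; they are not measured values for any ADH enzyme of the invention.

```python
def turnover_number(v_max: float, enzyme_conc: float) -> float:
    """Return k_cat = V_max / [E].

    With v_max in mol/L/s and enzyme_conc in mol/L, k_cat is in 1/s.
    """
    if enzyme_conc <= 0:
        raise ValueError("enzyme concentration must be positive")
    return v_max / enzyme_conc


def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """Return k_cat / K_M (1/(M*s) when k_cat is in 1/s and k_m in mol/L)."""
    if k_m <= 0:
        raise ValueError("K_M must be positive")
    return k_cat / k_m


# Hypothetical inputs, for illustration only.
k_cat = turnover_number(v_max=2.0e-6, enzyme_conc=1.0e-8)
efficiency = catalytic_efficiency(k_cat, k_m=1.0e-3)
print(f"k_cat = {k_cat:.1f} per second")
print(f"k_cat/K_M = {efficiency:.3g} per molar per second")
```

Two enzymes can be compared on the same substrate by computing both quantities under identical assay conditions.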

The term “specific activity” means enzyme units/mg protein where an enzyme unit is defined as moles of product formed/minute under specified conditions of temperature, pH, [S], etc.
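The specific activity calculation can be sketched as follows for an NADH consumption assay. This sketch rests on assumptions that are not stated in the text: that NADH disappearance is followed as a decrease in absorbance at 340 nm, that the molar absorptivity of NADH at 340 nm is 6220 M⁻¹ cm⁻¹, and that one unit is expressed in micromoles per minute. The function names and input values are hypothetical.

```python
# Assumed molar absorptivity of NADH at 340 nm, in 1/(M*cm).
NADH_EXTINCTION_M_CM = 6220.0


def enzyme_units(delta_a340_per_min: float,
                 reaction_volume_ml: float,
                 path_length_cm: float = 1.0) -> float:
    """Return micromoles of NADH consumed per minute in the assay."""
    # Beer-Lambert law: concentration change = absorbance change / (epsilon * l)
    molar_rate = delta_a340_per_min / (NADH_EXTINCTION_M_CM * path_length_cm)
    # mol/L/min -> mol/min (multiply by litres) -> micromol/min
    return molar_rate * (reaction_volume_ml / 1000.0) * 1.0e6


def specific_activity(units: float, protein_mg: float) -> float:
    """Return enzyme units per mg protein."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg


# Hypothetical assay readings, for illustration only.
units = enzyme_units(delta_a340_per_min=0.622, reaction_volume_ml=1.0)
print(f"{specific_activity(units, protein_mg=0.01):.2f} units/mg")
```

Because one NADH is consumed per aldehyde reduced, the NADH disappearance rate stands in for the rate of product formation.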

The terms “slow,” “slower,” “faster,” or “fast” when used in reference to an enzyme activity relate to the turnover number of the enzyme as compared with a standard.

The term “control polypeptide” refers to a known polypeptide having known alcohol dehydrogenase activity. Non-limiting examples of control polypeptides suitable for use in the invention include Achromobacter xylosoxidans SadB and horse liver ADH.

The term “lower alkyl alcohol” refers to any straight-chain or branched, saturated or unsaturated, alcohol molecule with 1-10 carbon atoms.

The term “lower alkyl aldehyde” refers to any straight-chain or branched, saturated or unsaturated, aldehyde molecule with 1-10 carbon atoms.

The term “butanol” as used herein refers to 1-butanol, 2-butanol, isobutanol, or mixtures thereof.

The term “biosynthetic pathway for production of a lower alkyl alcohol” as used herein refers to an enzyme pathway to produce lower alkyl alcohols. For example, isobutanol biosynthetic pathways are disclosed in U.S. Patent Application Publication No. 2007/0092957, which is incorporated by reference herein.

As used herein, the term “yield” refers to the amount of product per amount of carbon source in g/g. The yield may be exemplified for glucose as the carbon source. It is understood unless otherwise noted that yield is expressed as a percentage of the theoretical yield. In reference to a microorganism or metabolic pathway, “theoretical yield” is defined as the maximum amount of product that can be generated per total amount of substrate as dictated by the stoichiometry of the metabolic pathway used to make the product. For example, the theoretical yield for one typical conversion of glucose to isopropanol is 0.33 g/g. As such, a yield of isopropanol from glucose of 0.297 g/g would be expressed as 90% of theoretical or 90% theoretical yield. It is understood that while in the present disclosure the yield is exemplified for glucose as a carbon source, the invention can be applied to other carbon sources and the yield may vary depending on the carbon source used. One skilled in the art can calculate yields on various carbon sources.

The term “NADH” means reduced nicotinamide adenine dinucleotide.

The term “NADPH” means reduced nicotinamide adenine dinucleotide phosphate.

The term “NAD(P)H” is used to refer to either NADH or NADPH.
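Returning to the definition of “yield” given earlier in this section, the conversion between a yield in g/g and a percentage of theoretical yield can be sketched in Python as shown here. The function names are hypothetical. The figures 0.33 and 90% are taken from the isopropanol example; treating 0.33 as a g/g value is an assumption based on the stated unit of yield.

```python
def percent_of_theoretical(actual_yield: float, theoretical_yield: float) -> float:
    """Express a yield (g product / g carbon source) as % of theoretical."""
    if theoretical_yield <= 0:
        raise ValueError("theoretical yield must be positive")
    return 100.0 * actual_yield / theoretical_yield


def yield_at_percent(theoretical_yield: float, percent: float) -> float:
    """Return the g/g yield that corresponds to a given % of theoretical."""
    return theoretical_yield * percent / 100.0


# Theoretical yield from the isopropanol example, assumed to be in g/g.
theoretical = 0.33
actual = yield_at_percent(theoretical, 90.0)
print(f"{actual:.3f} g/g is "
      f"{percent_of_theoretical(actual, theoretical):.0f}% of theoretical")
```

The same two functions apply to other carbon sources once the pathway stoichiometry gives the corresponding theoretical yield.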

Polypeptides and Polynucleotides for Use in the Invention

The ADH enzymes used in the invention comprise polypeptides and fragments thereof. As used herein, the term “polypeptide” is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term “polypeptide” refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product. Thus, peptides, dipeptides, tripeptides, oligopeptides, “protein,” “amino acid chain,” or any other term used to refer to a chain or chains of two or more amino acids, are included within the definition of “polypeptide,” and the term “polypeptide” may be used instead of, or interchangeably with any of these terms. The term “polypeptide” is also intended to refer to the products of post-expression modifications of the polypeptide, including without limitation glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids.

A polypeptide of the invention may be of a size of about 10 or more, 20 or more, 25 or more, 50 or more, 75 or more, 100 or more, 200 or more, 500 or more, 1,000 or more, or 2,000 or more amino acids. Polypeptides may have a defined three-dimensional structure, although they do not necessarily have such structure. Polypeptides with a defined three-dimensional structure are referred to as folded, and polypeptides which do not possess a defined three-dimensional structure, but rather can adopt a large number of different conformations, are referred to as unfolded.

Also included as polypeptides of the present invention are derivatives, analogs, or variants of the foregoing polypeptides, and any combination thereof. The terms “active variant,” “active fragment,” “active derivative,” and “analog” refer to polypeptides of the present invention and include any polypeptides that are capable of catalyzing the reduction of a lower alkyl aldehyde. Variants of polypeptides of the present invention include polypeptides with altered amino acid sequences due to amino acid substitutions, deletions, and/or insertions. Variants may occur naturally or be non-naturally occurring. Non-naturally occurring variants may be produced using art-known mutagenesis techniques. Variant polypeptides may comprise conservative or non-conservative amino acid substitutions, deletions and/or additions. Derivatives of polypeptides of the present invention are polypeptides which have been altered so as to exhibit additional features not found on the native polypeptide. Examples include fusion proteins. Variant polypeptides may also be referred to herein as “polypeptide analogs.” As used herein a “derivative” of a polypeptide refers to a subject polypeptide having one or more residues chemically derivatized by reaction of a functional side group. Also included as “derivatives” are those peptides which contain one or more naturally occurring amino acid derivatives of the twenty standard amino acids. For example, 4-hydroxyproline may be substituted for proline; 5-hydroxylysine may be substituted for lysine; 3-methylhistidine may be substituted for histidine; homoserine may be substituted for serine; and ornithine may be substituted for lysine.

A “fragment” is a unique portion of an ADH enzyme which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one amino acid residue. For example, a fragment may comprise from 5 to 1000 contiguous amino acid residues. A fragment may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 100 or 200 amino acids of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.
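The fragment definition above can be expressed as a simple check: a fragment is a contiguous stretch of the parent sequence that is shorter than the parent by at least one residue. The Python sketch below is illustrative only. The function name, the default minimum length of 5 (the smallest of the exemplary lengths), and the short peptide strings are hypothetical and are not sequences from the Sequence Listing.

```python
def is_fragment(parent: str, candidate: str, min_length: int = 5) -> bool:
    """Return True if candidate is a contiguous portion of parent that is
    identical in sequence but shorter in length than parent."""
    if len(candidate) < min_length:
        return False
    if len(candidate) >= len(parent):
        # A fragment may be at most the full length minus one residue.
        return False
    return candidate in parent


# Hypothetical one-letter amino acid strings, for illustration only.
parent = "MKAAVLHTYGQPLEIKE"
print(is_fragment(parent, "AAVLHTYG"))   # contiguous and shorter
print(is_fragment(parent, parent))       # full length, so not a fragment
print(is_fragment(parent, "AAVHLTYG"))   # not identical to any portion
```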

Alternatively, recombinant variants encoding these same or similar polypeptides can be synthesized or selected by making use of the “redundancy” in the genetic code. Various codon substitutions, such as the silent changes which produce various restriction sites, may be introduced to optimize cloning into a plasmid or viral vector or expression in a host cell system. Mutations in the polynucleotide sequence may be reflected in the polypeptide or domains of other peptides added to the polypeptide to modify the properties of any part of the polypeptide, to change characteristics such as the K_(M) for a lower alkyl aldehyde, the K_(M) for a lower alkyl alcohol, the K_(I) for a lower alkyl alcohol, etc.
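To illustrate the redundancy of the genetic code, the following Python sketch checks whether a codon substitution is silent, i.e., whether two coding sequences translate to the same amino acid sequence. It assumes the standard genetic code; the function names and the short DNA strings are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, variant: str) -> bool:
    """Return True if the variant encodes the same polypeptide."""
    return len(original) == len(variant) and translate(original) == translate(variant)


# Hypothetical sequences: GCT->GCC and TTG->CTG leave Ala and Leu unchanged.
print(translate("ATGGCTTTGAAA"))
print(is_silent("ATGGCTTTGAAA", "ATGGCCCTGAAA"))
print(is_silent("ATGGCTTTGAAA", "ATGGATTTGAAA"))
```

A check of this kind can be used when introducing a restriction site or adjusting codon usage, to confirm that the encoded polypeptide is unchanged.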

Preferably, amino acid “substitutions” are the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements. “Conservative” amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid. “Insertions” or “deletions” are preferably in the range of about 1 to about 20 amino acids, more preferably 1 to 10 amino acids. The variation allowed may be experimentally determined by systematically making insertions, deletions, or substitutions of amino acids in a polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant variants for activity.
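The four amino acid groups listed above can be encoded directly. The Python sketch below treats a substitution as conservative when both residues fall in the same group. This is a simplification for illustration; the text also permits other similarity criteria. The function and variable names are hypothetical.

```python
# Groups as listed in the text, in one-letter code.
AMINO_ACID_GROUPS = {
    "nonpolar": set("ALIVPFWM"),       # Ala Leu Ile Val Pro Phe Trp Met
    "polar neutral": set("GSTCYNQ"),   # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),               # Arg Lys His
    "acidic": set("DE"),               # Asp Glu
}


def group_of(residue: str) -> str:
    """Return the name of the group containing a one-letter residue code."""
    for name, members in AMINO_ACID_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a standard amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues belong to the same group."""
    return group_of(original) == group_of(replacement)


print(is_conservative("L", "I"))  # both nonpolar
print(is_conservative("D", "K"))  # acidic versus basic
```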

By a polypeptide having an amino acid or polypeptide sequence at least, for example, 95% “identical” to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, or substituted with another amino acid. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.

As a practical matter, whether any particular polypeptide is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to a reference polypeptide can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al., Comp. Appl. Biosci. 6:237-245 (1990). In a sequence alignment the query and subject sequences are either both nucleotide sequences or both amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not show a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C-termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%. In another example, a 90 residue subject sequence is compared with a 100 residue query sequence. This time the deletions are internal deletions so there are no residues at the N- or C-termini of the subject sequence which are not matched/aligned with the query. In this case, the percent identity calculated by FASTDB is not manually corrected. Once again, only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the FASTDB alignment, which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
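The manual correction described above reduces to a single subtraction, sketched here in Python. The function name and arguments are hypothetical; the FASTDB percent identity and the count of unmatched terminal query residues are assumed to have been read from a FASTDB alignment, which this sketch does not perform.

```python
def corrected_percent_identity(fastdb_percent_identity: float,
                               query_length: int,
                               unmatched_terminal_residues: int) -> float:
    """Subtract the share of query residues lying outside the N- and
    C-terminal ends of the subject sequence from the FASTDB score."""
    if query_length <= 0:
        raise ValueError("query length must be positive")
    if not 0 <= unmatched_terminal_residues <= query_length:
        raise ValueError("unmatched residues must be between 0 and query length")
    penalty = 100.0 * unmatched_terminal_residues / query_length
    return fastdb_percent_identity - penalty


# First example in the text: 90-residue subject, 100-residue query,
# 10 unmatched N-terminal residues, remaining 90 perfectly matched.
print(corrected_percent_identity(100.0, 100, 10))

# Second example: internal deletions only, so no correction is applied.
# The 90.0 passed in is a hypothetical FASTDB score for illustration.
print(corrected_percent_identity(90.0, 100, 0))
```

The same arithmetic applies to nucleotide sequences, with unmatched 5′ and 3′ bases counted in place of terminal residues.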

Polypeptides useful in the invention include those that are at least about 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the sequences set forth in Table 5, including active variants, fragments, or derivatives thereof. The invention also encompasses polypeptides comprising amino acid sequences of Table 5 with conservative amino acid substitutions.

In one embodiment of the invention, polypeptides having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention have amino acid sequences that are at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, and SEQ ID NO: 40. In another embodiment of the invention, a polypeptide having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention has an amino acid sequence selected from the group consisting of: SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, and SEQ ID NO: 40, or an active variant, fragment or derivative thereof. In one embodiment, polypeptides having alcohol dehydrogenase activity are encoded by polynucleotides that have been codon-optimized for expression in a specific host cell.

In one embodiment of the invention, polypeptides having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 22. In another embodiment, the polypeptide comprises the amino acid sequence of SEQ ID NO: 22 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, polypeptides having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 23. In another embodiment, the polypeptide comprises the amino acid sequence of SEQ ID NO: 23 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, polypeptides having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 31. In another embodiment, the polypeptide comprises the amino acid sequence of SEQ ID NO: 31 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, polypeptides having alcohol dehydrogenase activity to be expressed in the recombinant host cells of the invention comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 29. In another embodiment, the polypeptide comprises the amino acid sequence of SEQ ID NO: 29 or an active variant, fragment or derivative thereof.

ADH enzymes suitable for use in the present invention and fragments thereof can be encoded by polynucleotides. The term “polynucleotide” is intended to encompass a singular nucleic acid as well as plural nucleic acids, and refers to an isolated nucleic acid molecule or construct, e.g., messenger RNA (mRNA), virally-derived RNA, or plasmid DNA (pDNA). A polynucleotide may comprise a conventional phosphodiester bond or a non-conventional bond (e.g., an amide bond, such as found in peptide nucleic acids (PNA)). The term “nucleic acid” refers to any one or more nucleic acid segments, e.g., DNA or RNA fragments, present in a polynucleotide. Polynucleotides according to the present invention further include such molecules produced synthetically. Polynucleotides of the invention may be native to the host cell or heterologous. In addition, a polynucleotide or a nucleic acid may be or may include a regulatory element such as a promoter, ribosome binding site, or a transcription terminator.

As used herein, a “coding region” or “ORF” is a portion of nucleic acid which consists of codons translated into amino acids. Although a “stop codon” (TAG, TGA, or TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, introns, 5′ and 3′ non-translated regions, and the like, are not part of a coding region.
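The treatment of the stop codon in this definition can be illustrated with a short Python sketch: a coding region is split into codons, and a terminal stop codon, if present, is counted as part of the coding region but set aside from the codons that are translated. The function names and the example sequence are hypothetical.

```python
# Stop codons as named in the text (DNA alphabet).
STOP_CODONS = frozenset({"TAG", "TGA", "TAA"})


def split_codons(coding_region: str) -> list:
    """Split a coding region into consecutive three-base codons."""
    seq = coding_region.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding region length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def sense_codons(coding_region: str) -> list:
    """Return the codons that are translated into amino acids.

    A terminal stop codon belongs to the coding region but is not translated,
    so it is dropped here. An internal stop codon is reported as an error.
    """
    codons = split_codons(coding_region)
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon found")
    return codons


# Hypothetical nine-base coding region: two sense codons and a stop codon.
print(split_codons("ATGGCTTAA"))
print(sense_codons("ATGGCTTAA"))
```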

The term “promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters.” It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

In certain embodiments, the polynucleotide or nucleic acid is DNA. In the case of DNA, a polynucleotide comprising a nucleic acid which encodes a polypeptide normally may include a promoter and/or other transcription or translation control elements operably associated with one or more coding regions. An operable association is when a coding region for a gene product, e.g., a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequence(s). Two DNA fragments (such as a polypeptide coding region and a promoter associated therewith) are “operably associated” if induction of promoter function results in the transcription of mRNA encoding the desired gene product and if the nature of the linkage between the two DNA fragments does not interfere with the ability of the expression regulatory sequences to direct the expression of the gene product or interfere with the ability of the DNA template to be transcribed. Thus, a promoter region would be operably associated with a nucleic acid encoding a polypeptide if the promoter was capable of affecting transcription of that nucleic acid. Other transcription control elements, besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can be operably associated with the polynucleotide. Suitable promoters and other transcription control regions are disclosed herein.

A variety of translation control elements are known to those of ordinary skill in the art. These include, but are not limited to, ribosome binding sites, translation initiation and termination codons, and elements derived from viral systems (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence).

In other embodiments, a polynucleotide of the present invention is RNA, for example, in the form of messenger RNA (mRNA). RNA of the present invention may be single stranded or double stranded.

Polynucleotide and nucleic acid coding regions of the present invention may be associated with additional coding regions which encode secretory or signal peptides, which direct the secretion of a polypeptide encoded by a polynucleotide of the present invention.

As used herein, the term “transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “recombinant” or “transformed” organisms.

The term “expression,” as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.

The terms “plasmid,” “vector,” and “cassette” refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA fragments. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

The term “artificial” refers to a synthetic, or non-host cell derived composition, e.g., a chemically-synthesized oligonucleotide.

By a nucleic acid or polynucleotide having a nucleotide sequence at least, for example, 95% “identical” to a reference nucleotide sequence of the present invention, it is intended that the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence.

As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to a nucleotide sequence or polypeptide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al., Comp. Appl. Biosci. 6:237-245 (1990). In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequences, whichever is shorter.
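The preprocessing step mentioned above, converting U's to T's so that an RNA sequence can be compared as DNA, is sketched below in Python together with a position-by-position identity count. This is not the FASTDB algorithm: the identity function assumes the two sequences are already aligned and of equal length, and it scores matches in a unitary fashion (match or no match). All names and sequences are hypothetical.

```python
def rna_to_dna(rna: str) -> str:
    """Convert an RNA sequence to its DNA-alphabet equivalent (U -> T)."""
    return rna.upper().replace("U", "T")


def percent_identity_aligned(query: str, subject: str) -> float:
    """Percent of positions that match between two pre-aligned,
    equal-length sequences. A stand-in for illustration, not FASTDB."""
    if len(query) != len(subject) or not query:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for q, s in zip(query.upper(), subject.upper()) if q == s)
    return 100.0 * matches / len(query)


# Hypothetical sequences, for illustration only.
query_dna = "ATGGCT"
subject_dna = rna_to_dna("AUGGCA")
print(subject_dna)
print(f"{percent_identity_aligned(query_dna, subject_dna):.1f}% identical")
```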

If the subject sequence is shorter than the query sequence because of 5′ or 3′ deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for 5′ and 3′ truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5′ or 3′ ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5′ and 3′ of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention. Only bases outside the 5′ and 3′ bases of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence, are calculated for the purposes of manually adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5′ end of the subject sequence and therefore, the FASTDB alignment does not show a match/alignment of the first 10 bases at the 5′ end. The 10 unpaired bases represent 10% of the sequence (number of bases at the 5′ and 3′ ends not matched/total number of bases in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 bases were perfectly matched the final percent identity would be 90%. In another example, a 90 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5′ or 3′ of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only bases 5′ and 3′ of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.

Polynucleotides useful in the invention include those that are at least about 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the nucleotide sequences set forth in Table 4, below, including variants, fragments or derivatives thereof that encode polypeptides with alcohol dehydrogenase activity.

The terms “active variant,” “active fragment,” “active derivative,” and “analog” refer to polynucleotides of the present invention and include any polynucleotides that encode polypeptides capable of catalyzing the reduction of a lower alkyl aldehyde. Variants of polynucleotides of the present invention include polynucleotides with altered nucleotide sequences due to base pair substitutions, deletions, and/or insertions. Variants may occur naturally or be non-naturally occurring. Non-naturally occurring variants may be produced using art-known mutagenesis techniques. Derivatives of polynucleotides of the present invention are polynucleotides which have been altered so that the polypeptides they encode exhibit additional features not found on the native polypeptide. Examples include polynucleotides that encode fusion proteins. Variant polynucleotides may also be referred to herein as “polynucleotide analogs.” As used herein a “derivative” of a polynucleotide refers to a subject polynucleotide having one or more nucleotides chemically derivatized by reaction of a functional side group. Also included as “derivatives” are those polynucleotides which contain one or more naturally occurring nucleotide derivatives. For example, 3-methylcytidine may be substituted for cytosine; ribothymidine may be substituted for thymidine; and N4-acetylcytidine may be substituted for cytosine.

A “fragment” is a unique portion of the polynucleotide encoding the ADH enzyme which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide. For example, a fragment may comprise from 5 to 1000 contiguous nucleotides. A fragment used as a probe, primer, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides. Fragments may be preferentially selected from certain regions of a molecule. For example, a polynucleotide fragment may comprise a certain length of contiguous nucleotides selected from the first 100 or 200 nucleotides of a polynucleotide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.

In one embodiment of the invention, polynucleotide sequences suitable for expression in recombinant host cells of the invention comprise nucleotide sequences that are at least about 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, or SEQ ID NO: 20. In another embodiment of the invention, a polynucleotide sequence suitable for expression in recombinant host cells of the invention can be selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, and SEQ ID NO: 20 or an active variant, fragment or derivative thereof. In one embodiment, polynucleotides have been codon-optimized for expression in a specific host cell.

In one embodiment of the invention, the polynucleotide sequence suitable for expression in recombinant host cells of the invention has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 2 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence suitable for expression in recombinant host cells of the invention has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 3. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 3 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence suitable for expression in recombinant host cells of the invention has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 11. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 11 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence suitable for expression in recombinant host cells of the invention has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 9. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 9 or an active variant, fragment or derivative thereof.

As used herein the term “codon degeneracy” refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

As used herein the term “codon optimized coding region” means a nucleic acid coding region that has been adapted for expression in the cells of a given organism by replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that organism.

Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The “genetic code” which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.

TABLE 1
The Standard Genetic Code

(rows: first base; columns: second base)

        T              C              A              G
T   TTT Phe (F)    TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
    TTC Phe (F)    TCC Ser (S)    TAC Tyr (Y)    TGC Cys (C)
    TTA Leu (L)    TCA Ser (S)    TAA Ter        TGA Ter
    TTG Leu (L)    TCG Ser (S)    TAG Ter        TGG Trp (W)

C   CTT Leu (L)    CCT Pro (P)    CAT His (H)    CGT Arg (R)
    CTC Leu (L)    CCC Pro (P)    CAC His (H)    CGC Arg (R)
    CTA Leu (L)    CCA Pro (P)    CAA Gln (Q)    CGA Arg (R)
    CTG Leu (L)    CCG Pro (P)    CAG Gln (Q)    CGG Arg (R)

A   ATT Ile (I)    ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
    ATC Ile (I)    ACC Thr (T)    AAC Asn (N)    AGC Ser (S)
    ATA Ile (I)    ACA Thr (T)    AAA Lys (K)    AGA Arg (R)
    ATG Met (M)    ACG Thr (T)    AAG Lys (K)    AGG Arg (R)

G   GTT Val (V)    GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
    GTC Val (V)    GCC Ala (A)    GAC Asp (D)    GGC Gly (G)
    GTA Val (V)    GCA Ala (A)    GAA Glu (E)    GGA Gly (G)
    GTG Val (V)    GCG Ala (A)    GAG Glu (E)    GGG Gly (G)
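The degeneracy counts quoted above (61 sense codons, four codons for alanine and proline, six for serine and arginine, one for tryptophan and methionine) can be checked with a short sketch. The 64-letter string below is the standard genetic code written in T, C, A, G base order; the names `GENETIC_CODE` and `degeneracy` are illustrative and not part of the specification.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# One-letter amino acid codes for all 64 codons, ordered by first, second,
# then third base in T, C, A, G order; '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each DNA codon (e.g. "ATG") to its one-letter amino acid code.
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy():
    """Return a Counter mapping each amino acid to its number of codons."""
    return Counter(aa for aa in GENETIC_CODE.values() if aa != "*")


if __name__ == "__main__":
    counts = degeneracy()
    print("sense codons:", sum(counts.values()))
    for aa in "APSRWM":
        print(aa, counts[aa])
```

Building the table from a compact string rather than typing 64 entries keeps the sketch short and makes transcription errors easier to spot.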

Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.

Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at http://www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. Nucl. Acids Res. 28:292 (2000). Codon usage tables for yeast, calculated from GenBank Release 128.0 [15 Feb. 2002], are reproduced below as Table 2. This table uses mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the tables use uracil (U) which is found in RNA. The Table has been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons.

TABLE 2
Codon Usage Table for Saccharomyces cerevisiae Genes

Amino Acid   Codon   Number   Frequency per thousand
Phe          UUU     170666   26.1
Phe          UUC     120510   18.4
Leu          UUA     170884   26.2
Leu          UUG     177573   27.2
Leu          CUU      80076   12.3
Leu          CUC      35545    5.4
Leu          CUA      87619   13.4
Leu          CUG      68494   10.5
Ile          AUU     196893   30.1
Ile          AUC     112176   17.2
Ile          AUA     116254   17.8
Met          AUG     136805   20.9
Val          GUU     144243   22.1
Val          GUC      76947   11.8
Val          GUA      76927   11.8
Val          GUG      70337   10.8
Ser          UCU     153557   23.5
Ser          UCC      92923   14.2
Ser          UCA     122028   18.7
Ser          UCG      55951    8.6
Ser          AGU      92466   14.2
Ser          AGC      63726    9.8
Pro          CCU      88263   13.5
Pro          CCC      44309    6.8
Pro          CCA     119641   18.3
Pro          CCG      34597    5.3
Thr          ACU     132522   20.3
Thr          ACC      83207   12.7
Thr          ACA     116084   17.8
Thr          ACG      52045    8.0
Ala          GCU     138358   21.2
Ala          GCC      82357   12.6
Ala          GCA     105910   16.2
Ala          GCG      40358    6.2
Tyr          UAU     122728   18.8
Tyr          UAC      96596   14.8
His          CAU      89007   13.6
His          CAC      50785    7.8
Gln          CAA     178251   27.3
Gln          CAG      79121   12.1
Asn          AAU     233124   35.7
Asn          AAC     162199   24.8
Lys          AAA     273618   41.9
Lys          AAG     201361   30.8
Asp          GAU     245641   37.6
Asp          GAC     132048   20.2
Glu          GAA     297944   45.6
Glu          GAG     125717   19.2
Cys          UGU      52903    8.1
Cys          UGC      31095    4.8
Trp          UGG      67789   10.4
Arg          CGU      41791    6.4
Arg          CGC      16993    2.6
Arg          CGA      19562    3.0
Arg          CGG      11351    1.7
Arg          AGA     139081   21.3
Arg          AGG      60289    9.2
Gly          GGU     156109   23.9
Gly          GGC      63903    9.8
Gly          GGA      71216   10.9
Gly          GGG      39359    6.0
Stop         UAA       6913    1.1
Stop         UAG       3312    0.5
Stop         UGA       4447    0.7

By utilizing this or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons optimal for a given species.
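As an illustration of applying such a table, the sketch below converts the raw codon counts for one amino acid into relative frequencies within that amino acid. The leucine counts are copied from Table 2; the function name `relative_frequencies` is hypothetical, and this is only one of several reasonable ways to normalize a usage table.

```python
# Codon counts for leucine in S. cerevisiae, copied from Table 2 (mRNA alphabet).
LEU_COUNTS = {
    "UUA": 170884,
    "UUG": 177573,
    "CUU": 80076,
    "CUC": 35545,
    "CUA": 87619,
    "CUG": 68494,
}


def relative_frequencies(counts):
    """Normalize codon counts so the values for one amino acid sum to 1."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    for codon, freq in sorted(relative_frequencies(LEU_COUNTS).items(),
                              key=lambda item: item[1], reverse=True):
        print(f"{codon}  {freq:.3f}")
```

Normalizing within each amino acid, rather than over all 64 codons, yields the probabilities needed when choosing among synonymous codons.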

Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art. Examples include the “EditSeq” function in the Lasergene Package, available from DNAstar, Inc., Madison, Wis., the backtranslation function in the Vector NTI Suite, available from InforMax, Inc., Bethesda, Md., and the “backtranslate” function in the GCG-Wisconsin Package, available from Accelrys, Inc., San Diego, Calif. In addition, various resources are publicly available to codon-optimize coding region sequences, e.g., the “backtranslation” function at http://www.entelechon.com/bioinformatics/backtranslation.php?lang=eng (visited Apr. 15, 2008) and the “backtranseq” function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html. Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.
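A rudimentary algorithm of the kind mentioned above might look like the following sketch, which picks a codon for each residue at random, weighted by the Table 2 counts. Only a handful of amino acids are included to keep the example short; the function name `back_translate`, the short test peptide, and the use of Python's `random.choices` are illustrative choices, not the method of any of the named software packages.

```python
import random

# Codon counts for a few amino acids, copied from Table 2 (mRNA alphabet).
# A complete implementation would list all twenty amino acids.
CODON_COUNTS = {
    "M": {"AUG": 136805},
    "F": {"UUU": 170666, "UUC": 120510},
    "K": {"AAA": 273618, "AAG": 201361},
    "W": {"UGG": 67789},
    "C": {"UGU": 52903, "UGC": 31095},
}


def back_translate(protein, codon_counts=CODON_COUNTS, rng=None):
    """Return an mRNA sequence encoding `protein` (one-letter codes).

    Each codon is drawn at random with probability proportional to its
    count in the usage table for the corresponding amino acid.
    """
    rng = rng or random.Random()
    codons = []
    for aa in protein:
        table = codon_counts[aa]  # raises KeyError for residues not listed
        codon = rng.choices(list(table), weights=list(table.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)


if __name__ == "__main__":
    # Seeding the generator makes the random assignment reproducible.
    print(back_translate("MFKWC", rng=random.Random(0)))
```

Passing raw counts as weights avoids a separate normalization step, since only the ratios between synonymous codons matter for the draw.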

Codon-optimized coding regions can be designed by various methods known to those skilled in the art including software packages such as “synthetic gene designer” (http://phenotype.biosci.umbc.edu/codon/sgd/index.php).

Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook et al. (Sambrook, Fritsch, and Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) (hereinafter “Maniatis”); and by Silhavy et al. (Silhavy et al., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1984); and by Ausubel, F. M. et al., (Ausubel et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience, 1987).

Alcohol Dehydrogenase (ADH) Enzymes

Alcohol dehydrogenases (ADH) are a broad class of enzymes that catalyze the interconversion of aldehydes and alcohols as part of various pathways in the cellular milieu. ADH enzymes are ubiquitous and are classified into multiple families based on either the length of the amino-acid sequence or the type of metal cofactors they use.

More than 150 structures are available in the Protein Data Bank (PDB) for a variety of ADH enzymes. The enzymes are highly divergent and different ADHs exist as oligomers with varying subunit compositions. FIG. 4 shows the phylogenetic relationship of oxidoreductase enzymes in Saccharomyces cerevisiae, E. coli, Homo sapiens, C. elegans, Drosophila melanogaster, and Arabidopsis thaliana that are related to horse liver ADH and Achromobacter xylosoxidans SadB.

FIG. 5 shows the phylogenetic relationship of specific ADH enzyme sequences more closely related to Achromobacter xylosoxidans SadB by sequence.

In one embodiment, ADH enzymes suitable for use in the present invention have a very high k_(cat) for the conversion of a lower alkyl aldehyde to a corresponding lower alkyl alcohol. In another embodiment, ADH enzymes suitable for use have a very low k_(cat) for the conversion of a lower alkyl alcohol to a corresponding lower alkyl aldehyde. In another embodiment, ADH enzymes suitable for use have a low K_(M) for lower alkyl aldehydes. In another embodiment, suitable ADH enzymes have a high K_(M) for lower alkyl alcohols. In another embodiment, suitable ADH enzymes preferentially use NADH as a cofactor during reduction reactions. In another embodiment, suitable ADH enzymes have one or more of the following characteristics: a very high k_(cat) for the conversion of a lower alkyl aldehyde to a corresponding lower alkyl alcohol; a very low k_(cat) for the conversion of a lower alkyl alcohol to a corresponding lower alkyl aldehyde; a low K_(M) for lower alkyl aldehydes; a high K_(M) for lower alkyl alcohols; and preferential use of NADH as a cofactor during reduction reactions. In another embodiment, suitable ADH enzymes have a high K_(I) for lower alkyl alcohols. In another embodiment, suitable ADH enzymes have two or more of the above characteristics.

In one embodiment, ADH enzymes suitable for use in the present invention oxidize cofactors in the presence and absence of a lower alkyl alcohol faster relative to control polypeptides. In one embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26.

In another embodiment, suitable ADH enzymes have a K_(M) for a lower alkyl aldehyde that is lower relative to a control polypeptide. In another embodiment, suitable ADH enzymes have a K_(M) for a lower alkyl aldehyde that is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, or 90% lower relative to a control polypeptide. In one embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26. In one embodiment, the lower alkyl aldehyde is isobutyraldehyde.

In another embodiment, suitable ADH enzymes have a K_(I) for a lower alkyl alcohol that is higher relative to a control polypeptide. In another embodiment, suitable ADH enzymes have a lower alkyl alcohol K_(I) that is at least about 10%, 50%, 100%, 200%, 300%, 400%, or 500% higher relative to a control polypeptide. In one embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26. In one embodiment, the lower alkyl alcohol is isobutanol.

In another embodiment, suitable ADH enzymes have a k_(cat)/K_(M) for a lower alkyl aldehyde that is higher relative to a control polypeptide. In another embodiment, suitable ADH enzymes have a k_(cat)/K_(M) that is at least about 10%, 50%, 100%, 200%, 300%, 400%, 500%, 600%, 800%, or 1000% higher relative to a control polypeptide. In one embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26. In one embodiment, the lower alkyl aldehyde is isobutyraldehyde.
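The relative comparisons described in the preceding paragraphs (K_(M) lower, K_(I) higher, k_(cat)/K_(M) higher than a control) reduce to simple percent differences. The sketch below shows one way to compute them; the class name `Kinetics`, the units, and all numeric values are hypothetical placeholders and are not measurements for SadB or any candidate enzyme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    k_cat: float  # turnover number, 1/s
    k_m: float    # Michaelis constant for the aldehyde, mM
    k_i: float    # inhibition constant for the alcohol, mM

    @property
    def efficiency(self):
        """Catalytic efficiency k_cat/K_M, in 1/(s*mM)."""
        return self.k_cat / self.k_m


def percent_change(candidate, control):
    """Signed percent difference of candidate relative to control."""
    return 100.0 * (candidate - control) / control


def compare(candidate, control):
    """Express the three kinetic criteria as percentages versus the control."""
    return {
        "K_M lower by %": -percent_change(candidate.k_m, control.k_m),
        "K_I higher by %": percent_change(candidate.k_i, control.k_i),
        "k_cat/K_M higher by %": percent_change(candidate.efficiency,
                                                control.efficiency),
    }


if __name__ == "__main__":
    control = Kinetics(k_cat=100.0, k_m=1.0, k_i=10.0)    # hypothetical values
    candidate = Kinetics(k_cat=150.0, k_m=0.5, k_i=30.0)  # hypothetical values
    for name, value in compare(candidate, control).items():
        print(f"{name}: {value:.0f}")
```

A positive value in each entry means the candidate meets the corresponding criterion, and the magnitude can be compared against the percentage thresholds listed in the text.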

In another embodiment, suitable ADH enzymes have two or more of the above characteristics. In another embodiment, suitable ADH enzymes have three or more of the above characteristics. In another embodiment, suitable ADH enzymes have all four of the above characteristics. In one embodiment, suitable ADH enzymes preferentially use NADH as a cofactor.

In one embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally at host cell physiological conditions. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally from about pH 4 to about pH 9. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally from about pH 5 to about pH 8. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally from about pH 6 to about pH 7. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally from about pH 6.5 to about pH 7. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally at about pH 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, or 9. In another embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally at about pH 7.

In one embodiment, suitable ADH enzymes for use in the present invention catalyze reduction reactions optimally at up to about 70° C. In another embodiment, suitable ADH enzymes catalyze reduction reactions optimally at about 10° C., 15° C., 20° C., 25° C., 30° C., 35° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., or 70° C. In another embodiment, suitable ADH enzymes catalyze reduction reactions optimally at about 30° C.

In one embodiment, suitable ADH enzymes for use in the present invention catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration up to about 50 g/L. In another embodiment, suitable ADH enzymes catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration of at least about 10 g/L, 15 g/L, 20 g/L, 25 g/L, 30 g/L, 35 g/L, 40 g/L, 45 g/L, or 50 g/L. In another embodiment, suitable ADH enzymes catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration of at least about 20 g/L. In some embodiments, the lower alkyl alcohol is butanol. In some embodiments, the lower alkyl aldehyde is isobutyraldehyde and the lower alkyl alcohol is isobutanol.
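To give a feel for why a high K_(I) matters at these alcohol concentrations, the sketch below converts an isobutanol concentration from g/L to mM and evaluates the fraction of maximal rate under a simple competitive product-inhibition model. The competitive model is an assumption made here for illustration only; the text above does not specify an inhibition mechanism, and the function names and parameter values are hypothetical.

```python
ISOBUTANOL_MOLAR_MASS = 74.12  # g/mol for C4H10O


def g_per_l_to_mM(conc_g_per_l, molar_mass=ISOBUTANOL_MOLAR_MASS):
    """Convert a mass concentration in g/L to a molar concentration in mM."""
    return conc_g_per_l / molar_mass * 1000.0


def relative_rate(s_mM, i_mM, k_m_mM, k_i_mM):
    """Fraction of V_max for substrate s with inhibitor i.

    Assumes classical competitive inhibition:
        v / V_max = S / (K_M * (1 + I / K_I) + S)
    """
    return s_mM / (k_m_mM * (1.0 + i_mM / k_i_mM) + s_mM)


if __name__ == "__main__":
    alcohol_mM = g_per_l_to_mM(20.0)  # 20 g/L isobutanol
    print(f"20 g/L isobutanol is about {alcohol_mM:.0f} mM")
    # Hypothetical enzymes that differ only in K_I for the alcohol.
    for k_i in (10.0, 100.0, 1000.0):
        rate = relative_rate(s_mM=1.0, i_mM=alcohol_mM, k_m_mM=1.0, k_i_mM=k_i)
        print(f"K_I = {k_i:6.0f} mM -> v/V_max = {rate:.3f}")
```

Under this assumed model, raising K_(I) at a fixed alcohol concentration moves the rate back toward its uninhibited value, which is the qualitative behavior the embodiments select for.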

Recombinant Host Cells for ADH Enzyme Expression

One aspect of the present invention is directed to recombinant host cells that express ADH enzymes having the above-outlined activities. Non-limiting examples of host cells for use in the invention include bacteria, cyanobacteria, filamentous fungi and yeasts.

In one embodiment, the recombinant host cell of the invention is a bacterial or a cyanobacterial cell. In another embodiment, the recombinant host cell is selected from the group consisting of: Salmonella, Arthrobacter, Bacillus, Brevibacterium, Clostridium, Corynebacterium, Gluconobacter, Nocardia, Pseudomonas, Rhodococcus, Streptomyces, Zymomonas, Escherichia, Lactobacillus, Enterococcus, Alcaligenes, Klebsiella, Serratia, Shigella, Erwinia, Paenibacillus, and Xanthomonas. In some embodiments, the recombinant host cell is E. coli, S. cerevisiae, or L. plantarum.

In another embodiment, the recombinant host cell of the invention is a filamentous fungus or yeast cell. In another embodiment, the recombinant host cell is selected from the group consisting of: Saccharomyces, Pichia, Hansenula, Yarrowia, Aspergillus, Kluyveromyces, Pachysolen, Rhodotorula, Zygosaccharomyces, Galactomyces, Schizosaccharomyces, Torulaspora, Debaryomyces, Williopsis, Dekkera, Kloeckera, Metschnikowia, Issatchenkia, and Candida.

In one embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, or 90% of theoretical. In one embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 25% of theoretical. In another embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 40% of theoretical. In another embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 50% of theoretical. In another embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 75% of theoretical. In another embodiment, the recombinant host cell of the invention produces a lower alkyl alcohol at a yield of greater than about 90% of theoretical. In some embodiments, the lower alkyl alcohol is butanol. In some embodiments, the lower alkyl alcohol is isobutanol.
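As a worked illustration of “percent of theoretical,” the sketch below assumes glucose as the carbon substrate and a stoichiometric maximum of one mole of isobutanol per mole of glucose (C6H12O6 → C4H10O + 2 CO2 + H2O), which corresponds to roughly 0.41 g of isobutanol per g of glucose. Both the substrate choice and the function names are assumptions made for this example; other substrates or products would have different theoretical maxima.

```python
GLUCOSE_MOLAR_MASS = 180.16     # g/mol for C6H12O6
ISOBUTANOL_MOLAR_MASS = 74.12   # g/mol for C4H10O
# Assumed stoichiometric maximum: 1 mol isobutanol per mol glucose.
THEORETICAL_MOL_PER_MOL = 1.0


def theoretical_mass_yield():
    """Maximum g isobutanol per g glucose under the assumed stoichiometry."""
    return THEORETICAL_MOL_PER_MOL * ISOBUTANOL_MOLAR_MASS / GLUCOSE_MOLAR_MASS


def percent_of_theoretical(isobutanol_g, glucose_consumed_g):
    """Observed mass yield expressed as a percentage of the theoretical maximum."""
    observed = isobutanol_g / glucose_consumed_g
    return 100.0 * observed / theoretical_mass_yield()


if __name__ == "__main__":
    print(f"theoretical yield: {theoretical_mass_yield():.3f} g/g")
    # Hypothetical fermentation: 30 g isobutanol from 100 g glucose consumed.
    print(f"{percent_of_theoretical(30.0, 100.0):.1f}% of theoretical")
```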

Non-limiting examples of lower alkyl alcohols produced by the recombinant host cells of the invention include butanol, propanol, isopropanol, and ethanol. In one embodiment, the recombinant host cells of the invention produce isobutanol. In another embodiment, the recombinant host cells of the invention do not produce ethanol.

U.S. Publ. No. 2007/0092957 A1 discloses the engineering of recombinant microorganisms for production of isobutanol (2-methylpropan-1-ol). U.S. Publ. No. 2008/0182308 A1 discloses the engineering of recombinant microorganisms for production of 1-butanol. U.S. Publ. Nos. 2007/0259410 A1 and 2007/0292927 A1 disclose the engineering of recombinant microorganisms for production of 2-butanol. Multiple pathways are described for biosynthesis of isobutanol and 2-butanol. The last step in all described pathways for all three products is the reduction of a more oxidized moiety to the alcohol moiety by an enzyme with butanol dehydrogenase activity. The methods disclosed in these publications can be used to engineer the recombinant host cells of the present invention. The information presented in these publications is hereby incorporated by reference in its entirety.

In embodiments, the recombinant microbial host cell produces isobutanol. In embodiments, the recombinant microbial host cell comprises at least two heterologous polynucleotides encoding enzymes which catalyze a substrate to product conversion selected from the group consisting of: pyruvate to acetolactate; acetolactate to 2,3-dihydroxyisovalerate; 2,3-dihydroxyisovalerate to alpha-ketoisovalerate; alpha-ketoisovalerate to isobutyraldehyde, and isobutyraldehyde to isobutanol. In embodiments, the recombinant microbial host cell comprises at least three heterologous polynucleotides encoding enzymes which catalyze a substrate to product conversion selected from the group consisting of: pyruvate to acetolactate; acetolactate to 2,3-dihydroxyisovalerate; 2,3-dihydroxyisovalerate to alpha-ketoisovalerate; alpha-ketoisovalerate to isobutyraldehyde, and isobutyraldehyde to isobutanol. In embodiments, the recombinant microbial host cell comprises at least four heterologous polynucleotides encoding enzymes which catalyze a substrate to product conversion selected from the group consisting of: pyruvate to acetolactate; acetolactate to 2,3-dihydroxyisovalerate; 2,3-dihydroxyisovalerate to alpha-ketoisovalerate; alpha-ketoisovalerate to isobutyraldehyde, and isobutyraldehyde to isobutanol. In embodiments, the recombinant microbial host cell comprises heterologous polynucleotides encoding enzymes which catalyze the conversion of pyruvate to acetolactate; acetolactate to 2,3-dihydroxyisovalerate; 2,3-dihydroxyisovalerate to alpha-ketoisovalerate; alpha-ketoisovalerate to isobutyraldehyde, and isobutyraldehyde to isobutanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of pyruvate to acetolactate is acetolactate synthase having the EC number 2.2.1.6; (b) the polypeptide that catalyzes a substrate to product conversion of acetolactate to 2,3-dihydroxyisovalerate is acetohydroxy acid isomeroreductase having the EC number 1.1.1.86; (c) the polypeptide that catalyzes a substrate to product conversion of 2,3-dihydroxyisovalerate to alpha-ketoisovalerate is acetohydroxy acid dehydratase having the EC number 4.2.1.9; and (d) the polypeptide that catalyzes a substrate to product conversion of alpha-ketoisovalerate to isobutyraldehyde is branched-chain alpha-keto acid decarboxylase having the EC number 4.1.1.72.
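The five substrate to product conversions of the isobutanol pathway can be represented as an ordered list, which makes it easy to check that the steps chain together and to see which steps a given host still lacks. This is only an illustrative data structure: the names `Step`, `is_contiguous`, and `missing_steps` are hypothetical, the isomeroreductase entry is written with the standard four-field number EC 1.1.1.86, and no EC number is filled in for the final step because the paragraph above does not assign one.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str
    ec_number: Optional[str]  # None where the text above gives no EC number


ISOBUTANOL_PATHWAY = [
    Step("pyruvate", "acetolactate",
         "acetolactate synthase", "2.2.1.6"),
    Step("acetolactate", "2,3-dihydroxyisovalerate",
         "acetohydroxy acid isomeroreductase", "1.1.1.86"),
    Step("2,3-dihydroxyisovalerate", "alpha-ketoisovalerate",
         "acetohydroxy acid dehydratase", "4.2.1.9"),
    Step("alpha-ketoisovalerate", "isobutyraldehyde",
         "branched-chain alpha-keto acid decarboxylase", "4.1.1.72"),
    Step("isobutyraldehyde", "isobutanol",
         "alcohol dehydrogenase", None),
]


def is_contiguous(pathway):
    """True if each step's product is the substrate of the next step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def missing_steps(pathway, expressed_enzymes):
    """Return the steps whose enzyme the host does not yet express."""
    return [step for step in pathway if step.enzyme not in expressed_enzymes]


if __name__ == "__main__":
    print("contiguous:", is_contiguous(ISOBUTANOL_PATHWAY))
    host = {"acetolactate synthase", "alcohol dehydrogenase"}  # hypothetical host
    for step in missing_steps(ISOBUTANOL_PATHWAY, host):
        print(f"missing: {step.substrate} -> {step.product} ({step.enzyme})")
```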

In embodiments, the recombinant microbial host cell further comprises at least one heterologous polynucleotide encoding an enzyme which catalyzes a substrate to product conversion selected from the group consisting of: pyruvate to alpha-acetolactate; alpha-acetolactate to acetoin; acetoin to 2,3-butanediol; 2,3-butanediol to 2-butanone; and 2-butanone to 2-butanol; and wherein said microbial host cell produces 2-butanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of pyruvate to acetolactate is acetolactate synthase having the EC number 2.2.1.6; (b) the polypeptide that catalyzes a substrate to product conversion of acetolactate to acetoin is acetolactate decarboxylase having the EC number 4.1.1.5; (c) the polypeptide that catalyzes a substrate to product conversion of acetoin to 2,3-butanediol is butanediol dehydrogenase having the EC number 1.1.1.76 or EC number 1.1.1.4; (d) the polypeptide that catalyzes a substrate to product conversion of butanediol to 2-butanone is butanediol dehydratase having the EC number 4.2.1.28. In embodiments, (e) the polypeptide that catalyzes a substrate to product conversion of 2-butanone to 2-butanol is 2-butanol dehydrogenase having the EC number 1.1.1.1.

In embodiments, the recombinant microbial host cell further comprises at least one heterologous polynucleotide encoding an enzyme which catalyzes a substrate to product conversion selected from the group consisting of: acetyl-CoA to acetoacetyl-CoA; acetoacetyl-CoA to 3-hydroxybutyryl-CoA; 3-hydroxybutyryl-CoA to crotonyl-CoA; crotonyl-CoA to butyryl-CoA; butyryl-CoA to butyraldehyde; butyraldehyde to 1-butanol; and wherein said microbial host cell produces 1-butanol. In embodiments, (a) the polypeptide that catalyzes a substrate to product conversion of acetyl-CoA to acetoacetyl-CoA is acetyl-CoA acetyltransferase having the EC number 2.3.1.9 or 2.3.1.16; (b) the polypeptide that catalyzes a substrate to product conversion of acetoacetyl-CoA to 3-hydroxybutyryl-CoA is 3-hydroxybutyryl-CoA dehydrogenase having the EC number 1.1.1.35, 1.1.1.30, 1.1.1.157, or 1.1.1.36; (c) the polypeptide that catalyzes a substrate to product conversion of 3-hydroxybutyryl-CoA to crotonyl-CoA is crotonase having the EC number 4.2.1.17 or 4.2.1.55; (d) the polypeptide that catalyzes a substrate to product conversion of crotonyl-CoA to butyryl-CoA is butyryl-CoA dehydrogenase having the EC number 1.3.1.44 or 1.3.1.38; (e) the polypeptide that catalyzes a substrate to product conversion of butyryl-CoA to butyraldehyde is butyraldehyde dehydrogenase having the EC number 1.2.1.57. In embodiments, (f) the polypeptide that catalyzes a substrate to product conversion of butyraldehyde to 1-butanol is 1-butanol dehydrogenase having the EC number 1.1.1.1.

In some embodiments, the recombinant microbial host cell further comprises at least one modification which improves carbon flow to the isobutanol pathway. In some embodiments, the recombinant microbial host cell further comprises at least one modification which improves carbon flow to the 1-butanol pathway. In some embodiments, the recombinant microbial host cell further comprises at least one modification which improves carbon flow to the 2-butanol pathway.

Methods for Producing Lower Alkyl Alcohols

Another aspect of the present invention is directed to methods for producing lower alkyl alcohols. These methods primarily employ the recombinant host cells of the invention. In one embodiment, the method of the present invention comprises providing a recombinant host cell as discussed above, contacting the recombinant host cell with a fermentable carbon substrate in a fermentation medium under conditions whereby the lower alkyl alcohol is produced, and recovering the lower alkyl alcohol.

Carbon substrates may include, but are not limited to, monosaccharides (such as fructose, glucose, mannose, rhamnose, xylose or galactose), oligosaccharides (such as lactose, maltose, or sucrose), polysaccharides (such as starch, maltodextrin, or cellulose), or mixtures thereof, and unpurified mixtures from renewable feedstocks such as cheese whey permeate, corn steep liquor, sugar beet molasses, and barley malt. Other carbon substrates may include ethanol, lactate, succinate, or glycerol.

Additionally, the carbon substrate may also be a one carbon substrate such as carbon dioxide or methanol, for which metabolic conversion into key biochemical intermediates has been demonstrated. In addition to one and two carbon substrates, methylotrophic organisms are also known to utilize a number of other carbon containing compounds such as methylamine, glucosamine and a variety of amino acids for metabolic activity. For example, methylotrophic yeasts are known to utilize the carbon from methylamine to form trehalose or glycerol (Bellion et al., Microb. Growth C1 Compd., [Int. Symp.], 7th (1993), 415-32, Editor(s): Murrell, J. Collin; Kelly, Don P. Publisher: Intercept, Andover, UK). Similarly, various species of Candida will metabolize alanine or oleic acid (Sulter et al., Arch. Microbiol. 153:485-489 (1990)). Hence, it is contemplated that the source of carbon utilized in the present invention may encompass a wide variety of carbon containing substrates and will only be limited by the choice of organism.

Although it is contemplated that all of the above mentioned carbon substrates and mixtures thereof are suitable in the present invention, preferred carbon substrates are glucose, fructose, and sucrose, or mixtures of these with C5 sugars such as xylose and/or arabinose for yeast cells modified to use C5 sugars. Sucrose may be derived from renewable sugar sources such as sugar cane, sugar beets, cassava, sweet sorghum, and mixtures thereof. Glucose and dextrose may be derived from renewable grain sources through saccharification of starch based feedstocks including grains such as corn, wheat, rye, barley, oats, and mixtures thereof. In addition, fermentable sugars may be derived from renewable cellulosic or lignocellulosic biomass through processes of pretreatment and saccharification, as described, for example, in U.S. Publ. No. 2007/0031918 A1, which is herein incorporated by reference. Biomass refers to any cellulosic or lignocellulosic material and includes materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. Biomass may also comprise additional components, such as protein and/or lipid. Biomass may be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass may comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves. Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste. Examples of biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, animal manure, and mixtures thereof.

The carbon substrates may be provided in any medium that is suitable for host cell growth and reproduction. Non-limiting examples of media that can be used include M122C, MOPS, SOB, TSY, YMG, YPD, 2XYT, LB, M17, or M9 minimal media. Other examples of media that can be used include solutions containing potassium phosphate and/or sodium phosphate. Suitable media can be supplemented with NADH or NADPH.

The fermentation conditions for producing a lower alkyl alcohol may vary according to the host cell being used. In one embodiment, the method for producing a lower alkyl alcohol is performed under anaerobic conditions. In one embodiment, the method for producing a lower alkyl alcohol is performed under aerobic conditions. In one embodiment, the method for producing a lower alkyl alcohol is performed under microaerobic conditions.

In one embodiment, the method for producing a lower alkyl alcohol results in a titer of at least about 20 g/L of a lower alkyl alcohol. In another embodiment, the method for producing a lower alkyl alcohol results in a titer of at least about 30 g/L of a lower alkyl alcohol. In another embodiment, the method for producing a lower alkyl alcohol results in a titer of about 10 g/L, 15 g/L, 20 g/L, 25 g/L, 30 g/L, 35 g/L or 40 g/L of a lower alkyl alcohol.

Non-limiting examples of lower alkyl alcohols produced by the methods ofthe invention include butanol, isobutanol, propanol, isopropanol, andethanol. In one embodiment, isobutanol is produced.

In embodiments, isobutanol is produced. In embodiments, the method for producing isobutanol comprises:

-   (a) providing a recombinant host cell comprising a heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol and which has one or more of the following characteristics:
    -   (i) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (ii) the K_(I) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and
-   (b) contacting the host cell of (a) with a carbon substrate under conditions whereby isobutanol is produced.

In embodiments, 2-butanol is produced. In embodiments, the method for producing 2-butanol comprises:

-   (a) providing a recombinant microbial host cell comprising a heterologous polypeptide which catalyzes the substrate to product conversion of 2-butanone to 2-butanol and which has one or more of the following characteristics:
    -   (i) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (ii) the K_(I) value for a lower alkyl alcohol for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and
-   (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 2-butanol is produced.

In embodiments, 1-butanol is produced. In embodiments, the method for producing 1-butanol comprises:

-   (a) providing a recombinant microbial host cell comprising a heterologous polypeptide which catalyzes the substrate to product conversion of butyraldehyde to 1-butanol and which has one or more of the following characteristics:
    -   (i) the K_(M) value for a lower alkyl aldehyde is lower for the polypeptide relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (ii) the K_(I) value for a lower alkyl alcohol for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26;
    -   (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde for the polypeptide is higher relative to a control polypeptide having the amino acid sequence of SEQ ID NO: 26; and
-   (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 1-butanol is produced.
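The three characteristics recited in the methods above are all comparisons against the control polypeptide of SEQ ID NO: 26. The following is a minimal sketch of how such a comparison could be tabulated. The class name, function name, units, and every numeric value are hypothetical and chosen for illustration only; none are measurements from this disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Kinetic constants for one polypeptide (illustrative units)."""
    k_m: float    # K_M for the compound named in characteristic (i), e.g. mM
    k_i: float    # K_I for the compound named in characteristic (ii), e.g. mM
    k_cat: float  # turnover number, e.g. 1/s

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency k_cat / K_M."""
        return self.k_cat / self.k_m


def characteristics_met(candidate: Kinetics, control: Kinetics) -> set[str]:
    """Return the labels (i, ii, iii) of the characteristics the candidate meets."""
    met = set()
    if candidate.k_m < control.k_m:                # (i) lower K_M
        met.add("i")
    if candidate.k_i > control.k_i:                # (ii) higher K_I
        met.add("ii")
    if candidate.efficiency > control.efficiency:  # (iii) higher k_cat/K_M
        met.add("iii")
    return met


# Hypothetical values, not data from this disclosure.
control = Kinetics(k_m=1.0, k_i=10.0, k_cat=5.0)
candidate = Kinetics(k_m=0.5, k_i=8.0, k_cat=4.0)

met = characteristics_met(candidate, control)
print(sorted(met), "qualifies:", bool(met))
```

Because the methods require only "one or more" of the characteristics, the sketch treats any non-empty result as qualifying.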

Biosynthetic Pathways

Recombinant microbial production hosts expressing a 1-butanol biosynthetic pathway (Donaldson et al., U.S. Patent Application Publication No. US20080182308A1, incorporated herein by reference), a 2-butanol biosynthetic pathway (Donaldson et al., U.S. Patent Publication Nos. US 20070259410A1, US 20070292927, and US 20090155870, all incorporated herein by reference), and an isobutanol biosynthetic pathway (Maggio-Hall et al., U.S. Patent Publication No. US 20070092957, incorporated herein by reference) have been described in the art. Certain suitable proteins having the ability to catalyze the indicated substrate to product conversions are described therein and other suitable proteins are described in the art. The skilled person will appreciate that polypeptides having the activity of such pathway steps can be isolated from a variety of sources and can be used in a recombinant host cell disclosed herein. For example, US Published Patent Application Nos. US20080261230, US20090163376, and US20100197519, and U.S. application Ser. No. 12/893,077, describe acetohydroxy acid isomeroreductases; US20070092957 and US20100081154 describe suitable dihydroxyacid dehydratases.

Equipped with this disclosure, a person of skill in the art will be able to utilize publicly available sequences to construct relevant pathways in the host cells provided herein. Additionally, one of skill in the art, equipped with this disclosure, will appreciate other suitable isobutanol, 1-butanol, or 2-butanol pathways.

Isobutanol Biosynthetic Pathway

Isobutanol can be produced from carbohydrate sources with recombinant microorganisms through various biosynthetic pathways. Suitable pathways converting pyruvate to isobutanol include the four complete reaction pathways shown in FIG. 6. A suitable isobutanol pathway (FIG. 6, steps a to e) comprises the following substrate to product conversions:

-   a) pyruvate to acetolactate, as catalyzed for example by acetolactate synthase,
-   b) acetolactate to 2,3-dihydroxyisovalerate, as catalyzed for example by acetohydroxy acid isomeroreductase,
-   c) 2,3-dihydroxyisovalerate to α-ketoisovalerate, as catalyzed for example by acetohydroxy acid dehydratase,
-   d) α-ketoisovalerate to isobutyraldehyde, as catalyzed for example by a branched-chain keto acid decarboxylase, and
-   e) isobutyraldehyde to isobutanol, as catalyzed for example by a branched-chain alcohol dehydrogenase.

Another suitable pathway for converting pyruvate to isobutanol comprises the following substrate to product conversions (FIG. 6, steps a, b, c, f, g, e):

-   a) pyruvate to acetolactate, as catalyzed for example by acetolactate synthase,
-   b) acetolactate to 2,3-dihydroxyisovalerate, as catalyzed for example by acetohydroxy acid isomeroreductase,
-   c) 2,3-dihydroxyisovalerate to α-ketoisovalerate, as catalyzed for example by acetohydroxy acid dehydratase,
-   f) α-ketoisovalerate to isobutyryl-CoA, as catalyzed for example by a branched-chain keto acid dehydrogenase,
-   g) isobutyryl-CoA to isobutyraldehyde, as catalyzed for example by an acylating aldehyde dehydrogenase, and
-   e) isobutyraldehyde to isobutanol, as catalyzed for example by a branched-chain alcohol dehydrogenase.

The first three steps in this pathway (a, b, c) are the same as those described above.

Another suitable pathway for converting pyruvate to isobutanol comprises the following substrate to product conversions (FIG. 6, steps a, b, c, h, i, j, e):

-   a) pyruvate to acetolactate, as catalyzed for example by acetolactate synthase,
-   b) acetolactate to 2,3-dihydroxyisovalerate, as catalyzed for example by acetohydroxy acid isomeroreductase,
-   c) 2,3-dihydroxyisovalerate to α-ketoisovalerate, as catalyzed for example by acetohydroxy acid dehydratase,
-   h) α-ketoisovalerate to valine, as catalyzed for example by valine dehydrogenase or transaminase,
-   i) valine to isobutylamine, as catalyzed for example by valine decarboxylase,
-   j) isobutylamine to isobutyraldehyde, as catalyzed for example by omega transaminase, and
-   e) isobutyraldehyde to isobutanol, as catalyzed for example by a branched-chain alcohol dehydrogenase.

The first three steps in this pathway (a, b, c) are the same as those described above.

A fourth suitable isobutanol biosynthetic pathway comprises the substrate to product conversions shown as steps k, g, e in FIG. 6.
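The lettered steps of FIG. 6 listed above can be treated as edges between metabolites, which makes it easy to confirm that each recited pathway forms an unbroken chain from pyruvate to isobutanol. The sketch below is illustrative only: the dictionary and function names are hypothetical, and step k is omitted because its substrate to product conversion is not spelled out in this section.

```python
# Substrate -> product conversions for the FIG. 6 steps recited in the text.
STEPS = {
    "a": ("pyruvate", "acetolactate"),
    "b": ("acetolactate", "2,3-dihydroxyisovalerate"),
    "c": ("2,3-dihydroxyisovalerate", "alpha-ketoisovalerate"),
    "d": ("alpha-ketoisovalerate", "isobutyraldehyde"),
    "e": ("isobutyraldehyde", "isobutanol"),
    "f": ("alpha-ketoisovalerate", "isobutyryl-CoA"),
    "g": ("isobutyryl-CoA", "isobutyraldehyde"),
    "h": ("alpha-ketoisovalerate", "valine"),
    "i": ("valine", "isobutylamine"),
    "j": ("isobutylamine", "isobutyraldehyde"),
}

# The three pathways whose steps are fully listed in the text.
PATHWAYS = {
    "first": "abcde",
    "second": "abcfge",
    "third": "abchije",
}


def is_connected(steps: str) -> bool:
    """True if each step's product is the next step's substrate."""
    pairs = [STEPS[s] for s in steps]
    return all(prev[1] == nxt[0] for prev, nxt in zip(pairs, pairs[1:]))


def metabolites(steps: str) -> list[str]:
    """Ordered metabolites visited by a connected pathway."""
    pairs = [STEPS[s] for s in steps]
    return [pairs[0][0]] + [product for _, product in pairs]


for name, steps in PATHWAYS.items():
    print(name, is_connected(steps), " -> ".join(metabolites(steps)))
```

All three pathways share steps a, b, and c and converge on isobutyraldehyde, so step e, the alcohol dehydrogenase step that is the subject of this disclosure, is common to each.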

1-Butanol Biosynthetic Pathway

An example of a suitable biosynthetic pathway for production of 1-butanol is disclosed in U.S. Patent Application Publication No. US 2008/0182308 A1. As disclosed in this publication, steps in the disclosed 1-butanol biosynthetic pathway include conversion of:

-   acetyl-CoA to acetoacetyl-CoA, as catalyzed for example by acetyl-CoA acetyltransferase;
-   acetoacetyl-CoA to 3-hydroxybutyryl-CoA, as catalyzed for example by 3-hydroxybutyryl-CoA dehydrogenase;
-   3-hydroxybutyryl-CoA to crotonyl-CoA, as catalyzed for example by crotonase;
-   crotonyl-CoA to butyryl-CoA, as catalyzed for example by butyryl-CoA dehydrogenase;
-   butyryl-CoA to butyraldehyde, as catalyzed for example by butyraldehyde dehydrogenase; and
-   butyraldehyde to 1-butanol, as catalyzed for example by butanol dehydrogenase.

2-Butanol Biosynthetic Pathway

An example of a suitable biosynthetic pathway for production of 2-butanol is described by Donaldson et al. in U.S. Patent Application Publication Nos. US20070259410A1 and US 20070292927A1, and in PCT Publication WO 2007/130521, all of which are incorporated herein by reference. A suitable 2-butanol biosynthetic pathway comprises the following substrate to product conversions:

-   a) pyruvate to alpha-acetolactate, which may be catalyzed, for example, by acetolactate synthase;
-   b) alpha-acetolactate to acetoin, which may be catalyzed, for example, by acetolactate decarboxylase;
-   c) acetoin to 2,3-butanediol, which may be catalyzed, for example, by butanediol dehydrogenase;
-   d) 2,3-butanediol to 2-butanone, which may be catalyzed, for example, by butanediol dehydratase; and
-   e) 2-butanone to 2-butanol, which may be catalyzed, for example, by 2-butanol dehydrogenase.

Additional Modifications

Additional modifications that may be useful in cells provided herein include modifications to reduce pyruvate decarboxylase and/or glycerol-3-phosphate dehydrogenase activity as described in US Patent Application Publication No. 20090305363 (incorporated herein by reference), and modifications to a host cell that provide for increased carbon flux through an Entner-Doudoroff Pathway or reducing equivalents balance as described in US Patent Application Publication No. 20100120105 (incorporated herein by reference). Yeast strains with increased activity of heterologous proteins that require binding of an Fe—S cluster for their activity are described in US Application Publication No. 20100081179 (incorporated herein by reference). Other modifications include modifications in an endogenous polynucleotide encoding a polypeptide having dual-role hexokinase activity, described in U.S. Provisional Application No. 61/290,639, and integration of at least one polynucleotide encoding a polypeptide that catalyzes a step in a pyruvate-utilizing biosynthetic pathway, described in U.S. Provisional Application No. 61/380,563 (both referenced provisional applications are incorporated herein by reference in their entirety). Additional modifications that may be suitable for embodiments herein are described in U.S. application Ser. No. 12/893,089.

Additionally, host cells comprising at least one deletion, mutation, and/or substitution in an endogenous gene encoding a polypeptide affecting Fe—S cluster biosynthesis are described in U.S. Provisional Patent Application No. 61/305,333 (incorporated herein by reference), and host cells comprising a heterologous polynucleotide encoding a polypeptide with phosphoketolase activity and host cells comprising a heterologous polynucleotide encoding a polypeptide with phosphotransacetylase activity are described in U.S. Provisional Patent Application No. 61/356,379.

Identification and Isolation of High Activity ADH Enzymes

The present invention is directed to devising a strategy and identifying several ADH enzymes with superior properties towards the conversion of isobutyraldehyde to isobutanol in a host organism that has been engineered for isobutanol production. The process of ADH candidate selection involves searching among the naturally existing enzymes. Enzymes are identified based on their natural propensity to utilize aldehydes as preferred substrates and convert them to the respective alcohols with reasonably high k_(cat) and/or low K_(M) values for the corresponding aldehyde substrates, as documented by literature examples. Once a set of candidates is identified, the strategy involves using this set to isolate closely-related homologues via bioinformatics analysis. Therefore, in one embodiment, the screening method of the invention comprises performing a bioinformatics or literature search for candidate ADH enzymes. In one embodiment, the bioinformatics search uses a phylogenetic analysis.

The protein-encoding DNA sequences of the candidate genes are either amplified directly from the host organisms or procured as codon-optimized synthetic genes for expression in a host cell, such as E. coli. Various ADH candidates utilized herein are listed in Table 3.

TABLE 3

Gene                                         Polynucleotide SEQ ID NO:   Polypeptide SEQ ID NO:
-------------------------------------------  --------------------------  ----------------------
Horse-liver ADH                              1                           21
Saccharomyces cerevisiae ADH6                2                           22
Saccharomyces cerevisiae ADH7                3                           23
Clostridium acetobutylicum BdhA              4                           24
Clostridium acetobutylicum BdhB              5                           25
Achromobacter xylosoxidans SadB              6                           26
Bos taurus ARD                               7                           27
Rana perezi ADH8                             8                           28
Clostridium beijerinckii ADH                 9                           29
Entamoeba histolytica ADH1                   10                          30
Beijerinckia indica ADH                      11                          31
Rattus norvegicus ADH1                       12                          32
Thermus sp. ATN1 ADH                         13                          33
Phenylobacterium zucineum HLK1 ADH           14                          34
Methylocella silvestris BL2 ADH              15                          35
Acinetobacter baumannii AYE ADH              16                          36
Geobacillus sp. WCH70 ADH                    17                          37
Vanderwaltozyma polyspora DSM 70294 ADH      18                          38
Mucor circinelloides ADH                     19                          39
Rhodococcus erythropolis PR4 ADH             20                          40

The present invention is not limited to the ADH enzymes listed in Table 3. Additional candidates can be identified based on sequence homologies to these candidates or candidates can be derived from these sequences via mutagenesis and/or protein evolution. Suitable ADH enzymes include ADH enzymes having at least about 95% identity to the sequences provided herein.
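As an illustration of the identity criterion above, the following sketch computes percent identity for two sequences that have already been aligned. It is a simplified example under stated assumptions: the function name is hypothetical, gaps are written as "-", identity is taken over all aligned columns (other conventions use a different denominator), and the short peptide strings are made-up inputs rather than sequences from this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Made-up example: one substitution in eight aligned positions.
identity = percent_identity("MKALVYHG", "MKALVYRG")
print(f"{identity:.1f}% identity; meets 95% threshold: {identity >= 95.0}")
```

In practice the alignment itself would come from an established alignment program, and the identity definition used should be stated alongside any reported percentage.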

Tables 4 and 5 provide the polynucleotide (codon-optimized for expression in E. coli except for SEQ ID NOs. 2, 3, 4, 5, and 6) and polypeptide sequences of the candidate ADH enzymes presented in Table 3, respectively.
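The relationship between a polynucleotide sequence and the polypeptide it encodes can be sketched with a short translation routine based on the standard genetic code. This is an illustrative example only: the function name is hypothetical, and the input is just the first seven codons of SEQ ID NO: 6 as printed in Table 4, not a full-length sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon).

    Assumes the sequence starts in frame and contains only A, C, G, T.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First seven codons of SEQ ID NO: 6 as printed in Table 4.
print(translate("atgaaagctctggtttatcac"))
```

A codon-optimized gene and the native gene would translate to the same polypeptide under this table, since codon optimization changes synonymous codons without changing the encoded amino acids.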

TABLE 4 SEQ ID NO POLYNUCLEOTIDE SEQUENCE  1atgtcaacagccggtaaagttattaagtgtaaagcggcagttttgtgggaagagaaaaagccgtttagcatagaagaagtagaagtagcgccaccaaaagcacacgaggttagaatcaagatggttgccaccggaatctgtagatccgacgaccatgtggtgagtggcactctagttactcctttgccagtaatcgcgggacacgaggctgccggaatcgttgaatccataggtgaaggtgttaccactgttcgtcctggtgataaagtgatcccactgttcactcctcaatgtggtaagtgtagagtctgcaaacatcctgagggtaatttctgccttaaaaatgatttgtctatgcctagaggtactatgcaggatggtacaagcagatttacatgcagagggaaacctatacaccatttccttggtacttctacattttcccaatacacagtggtggacgagatatctgtcgctaaaatcgatgcagatcaccactggaaaaagtttgcttgatagggtgcggattttccaccggttacggttccgcagttaaagttgcaaaggttacacagggttcgacttgtgcagtattcggtttaggaggagtaggactaagcgttattatggggtgtaaagctgcaggcgcagcgaggattataggtgtagacatcaataaggacaaatttgcaaaagctaaggaggtcggggctactgaatgtgttaaccctcaagattataagaaaccaatacaagaagtccttactgaaatgtcaaacggtggagttgatttctcttttgaagttataggccgtcttgatactatggtaactgcgttgtcctgctgtcaagaggcatatggagtcagtgtgatcgtaggtgttcctcctgattcacaaaatttgtcgatgaatcctatgctgttgctaagcggtcgtacatggaagggagctatatttggcggttttaagagcaaggatagtgttccaaaacttgttgccgactttatggcgaagaagtttgctcttgatcctttaattacacatgtattgccattcgagaaaatcaatgaagggtttgatttgttaagaagtggtgaatctattcgtacaattttaactttttga  
2atgtcttatcctgagaaatttgaaggtatcgctattcaatcacacgaagattggaaaaacccaaagaagacaaagtatgacccaaaaccattttacgatcatgacattgacattaagatcgaagcatgtggtgtctgcggtagtgatattcattgtgcagctggtcattggggcaatatgaagatgccgctagtcgttggtcatgaaatcgttggtaaagttgtcaagctagggcccaagtcaaacagtgggttgaaagtcggtcaacgtgttggtgtaggtgctcaagtatttcatgatggaatgtgaccgttgtaagaatgataatgaaccatactgcaccaagtttgttaccacatacagtcagccttatgaagacggctatgtgtcgcagggtggctatgcaaactacgtcagagttcatgaacattttgtggtgcctatcccagagaatattccatcacatttggctgctccactattatgtggtggtttgactgtgtactctccattggttcgtaacggttgcggtccaggtaaaaaagttggtatagttggtcttggtggtatcggcagtatgggtacattgatttccaaagccatgggggcagagacgtatgttatttctcgttcttcgagaaaaagagaagatgcaatgaagatgggcgccgatcactacattgctacattagaagaaggtgattggggtgaaaagtactttgacaccttcgacctgattgtagtctgtgatcctcccttaccgacattgacttcaacattatgccaaaggctatgaaggttggtggtagaattgtctcaatctctataccagaacaacacgaaatgttatcgctaaagccatatggcttaaaggctgtctccatttcttacagtgctttaggttccatcaaagaattgaaccaactcttgaaattagtctctgaaaaagatatcaaaatttgggtggaaacattacctgttggtgaagccggcgtccatgaagccttcgaaaggatggaaaagggtgacgttagatatagatttaccttagtcggctacgacaaagaattttcagactag 
3atgctttacccagaaaaatttcagggcatcggtatttccaacgcaaaggattggaagcatcctaaattagtgagttttgacccaaaaccctttggcgatcatgacgttgatgttgaaattgaagcctgtggtatctgcggatctgattttcatatagccgttggtaattggggtccagtcccagaaaatcaaatccttggacatgaaataattggccgcgtggtgaaggttggatccaagtgccacactggggtaaaaatcggtgaccgtgttggtgttggtgcccaagccttggcgtgttttgagtgtgaacgttgcaaaagtgacaacgagcaatactgtaccaatgaccacgttttgactatgtggactccttacaaggacggctacatttcacaaggaggctttgcctcccacgtgaggcttcatgaacactttgctattcaaataccagaaaatattccaagtccgctagccgctccattattgtgtggtggtattacagttttctctccactactaagaaatggctgtggtccaggtaagagggtaggtattgttggcatcggtggtattgggcatatggggattctgttggctaaagctatgggagccgaggtttatgcgttttcgcgaggccactccaagcgggaggattctatgaaactcggtgctgatcactatattgctatgttggaggataaaggctggacagaacaatactctaacgctttggaccttatgtcgtttgctcatcatctttgtcgaaagttaattttgacagtatcgttaagattatgaagattggaggctccatcgtttcaattgctgctcctgaagttaatgaaaagcttgttttaaaaccgttgggcctaatgggagtatcaatctcaagcagtgctatcggatctaggaaggaaatcgaacaactattgaaattagtttccgaaaagaatgtcaaaatatgggtggaaaaacttccgatcagcgaagaaggcgtcagccatgcctttacaaggatggaaagcggagacgtcaaatacagatttactttggtcgattatgataagaaattccataaatag  
4atgctaagttttgattattcaataccaactaaagttttttttggaaaaggaaaaatagacgtaattggagaagaaattaagaaatatggctcaagagtgcttatagtttatggcggaggaagtataaaaaggaacggtatatatgatagagcaacagctatattaaaagaaaacaatatagattctatgaactttcaggagtagagccaaatcctaggataacaacagtaaaaaaaggcatagaaatatgtagagaaaataatgtggatttagtattagcaatagggggaggaagtgcaatagactgttctaaggtaattgcagctggagtttattatgatggcgatacatgggacatggttaaagatccatctaaaataactaaagttatccaattgcaagtatacttactattcagcaacagggtctgaaatggatcaaattgcagtaatttcaaatatggagactaatgaaaagcttggagtaggacatgatgatatgagacctaaattttcagtgttagatcctacatatacttttacagtacctaaaaatcaaacagcagcgggaacagctgacattatgagtcacacctttgaatcttactttagtggtgttgaaggtgatatgtgcaggacggtatacgagaagcaatcttaagaacatgtataaagtatggaaaaatagcaatggagaagactgatgattacgaggctagagctaatttgatgtgggcttcaagtttagctataaatggtctattatcacttggtaaggatagaaaatggagttgtcatcctatggaacacgagttaagtgcatattatgatataacacatggtgtaggacttgcaattttaacacctaattggatggaatatattctaaatgacgatacacttcataaatttgtttcttatggaataaatgtttggggaatagacaagaacaaagataactatgaaatagcacgagaggctattaaaaatacgagagaatactttaattcattgggtattccttcaaagcttagagaagttggaataggaaaagataaactagaactaatggcaaagcaagctgttagaaattctggaggaacaataggaagtttaagaccaataaatgcagaggatgttcttgagatatttaaaaaatcttattaa  
5atggttgatttcgaatattcaataccaactagaatttttttcggtaaagataagataaatgtacttggaagagagcttaaaaaatatggttctaaagtgcttatagtttatggtggaggaagtataaagagaaatggaatatatgataaagctgtaagtatacttgaaaaaaacagtattaaattttatgaacttgcaggagtagagccaaatccaagagtaactacagttgaaaaaggagttaaaatatgtagagaaaatggagttgaagtagtactagctataggtggaggaagtgcaatagattgcgcaaaggttatagcagcagcatgtgaatatgatggaaatccatgggatattgtgttagatggctcaaaaataaaaagggtgcttcctatagctagtatattaaccattgctgcaacaggatcagaaatggatacgtgggcagtaataaataatatggatacaaacgaaaaactaattgcggcacatccagatatggctcctaagttttctatattagatccaacgtatacgtataccgtacctaccaatcaaacagcagcaggaacagctgatattatgagtcatatatttgaggtgtattttagtaatacaaaaacagcatatttgcaggatagaatggcagaagcgttattaagaacttgtattaaatatggaggaatagctcttgagaagccggatgattatgaggcaagagccaatctaatgtgggcttcaagtcttgcgataaatggacttttaacatatggtaaagacactaattggagtgtacacttaatggaacatgaattaagtgatattacgacataacacacggcgtagggcttgcaattttaacacctaattggatggagtatattttaaataatgatacagtgtacaagtttgttgaatatggtgtaaatgtttggggaatagacaaagaaaaaaatcactatgacatagcacatcaagcaatacaaaaaacaagagattactttgtaaatgtactaggtttaccatctagactgagagatgttggaattgaagaagaaaaattggacataatggcaaaggaatcagtaaagcttacaggaggaaccataggaaacctaagaccagtaaacgcctccgaagtcctacaaatattcaaaaaatctgtgtaa  
6atgaaagctctggtttatcacggtgaccacaagatctcgcttgaagacaagcccaagcccaccatcaaaagcccacggatgtagtagtacgggttttgaagaccacgatctgcggcacggatctcggcatctacaaaggcaagaatccagaggtcgccgacgggcgcatcctgggccatgaaggggtaggcgtcatcgaggaagtgggcgagagtgtcacgcagttcaagaaaggcgacaaggtcctgatttcctgcgtcacttatgcggctcgtgcgactactgcaagaagcagctttactcccattgccgcgacggcgggtggatcctgggttacatgatcgatggcgtgcaggccgaatacgtccgcatcccgcatgccgacaacagcctctacaagatcccccagacaattgacgacgaaatcgccgtcctgctgagcgacatcctgcccaccggccacgaaatcggcgtccagtatgggaatgtccagccgggcgatgcggtggctattgtcggcgcgggccccgtcggcatgtccgtactgttgaccgcccagttctactccccctcgaccatcatcgtgatcgacatggacgagaatcgcctccagctcgccaaggagctcggggcaacgcacaccatcaactccggcacggagaacgttgtcgaagccgtgcataggattgcggcagagggagtcgatgttgcgatcgaggcggtgggcataccggcgacttgggacatctgccaggagatcgtcaagcccggcgcgcacatcgccaacgtcggcgtgcatggcgtcaaggttgacttcgagattcagaagctctggatcaagaacctgacgatcaccacgggactggtgaacacgaacacgacgcccatgctgatgaaggtcgcctcgaccgacaagcttccgttgaagaagatgattacccatcgcttcgagctggccgagatcgagcacgcctatcaggtattcctcaatggcgccaaggagaaggcgatgaagatcatcctctcgaacgcaggcgctgcctga  
7atggcggcgagctgcattttgctgcacaccggtcaaaagatgccgctgatcggtctgggcacctggaaatctgacccaggtcaagtgaaggcggcaattaagtatgcgctgagcgtcggttatcgtcacattgactgcgcggcaatctacggcaatgaaaccgagattggcgaggcgttgaaagagaacgtcggtccgggtaagctggtcccgcgtgaagaactgtttgtcacgagcaagctgtggaataccaagcaccacccggaggacgtggaaccggctctgcgcaaaaccctggccgatctgcagttggagtacttggatctgtatttgatgcactggccgtatgcgtttgaacgcggtgactctccgttcccgaagaacgccgacggcaccatccgttacgacagcactcattataaagaaacctggcgtgcgctggaggcgctggttgcaaaaggtctggtgcgtgccctgggtttgagcaattttaattctcgtcagatcgacgatgttctgagcgtggcctctgtgcgtccggctgtgttgcaggtcgagtgtcacccttatctggcgcaaaacgagctgatcgctcattgtcaagcgcgtaatctggaagtgaccgcgtactccccgctgggtagcagcgaccgcgcctggcgtgatccggaagaacctgttctgctgaaagaaccggtcgtgctggcgctggctgaaaagcacggtcgcagcccagcgcagatcttgctgcgttggcaagttcagcgcaaagtttcttgcatcccgaaatctgtcacgccgagccgtattctggagaacattcaagttttcgacttcacctttagcccggaagaaatgaagcagctggacgccctgaacaagaatctgcgttttattgtgccgatgttgaccgtggacggcaagcgcgttccgcgtgacgcgggtcacccgttgtatccatttaacgatccgtactaatga  8atgtgcaccgccggtaaagatattacgtgtaaagcggcggtcgcttgggagccgcataaaccgctgtccctggaaacgatcacggttgcacctccaaaagcgcatgaggtgcgtattaaaatcctggcgtctggcatctgcggtagcgacagcagcgttctgaaagagatcatcccgagcaagttcccggtgattctgggtcatgaggcggtgggcgtggttgagagcatcggtgcgggcgttacgtgcgtgaaaccgggtgacaaggtgatcccgctgttcgtgccgcaatgtggttcttgtcgcgcatgtaaaagcagcaatagcaacttctgtgagaagaatgatatgggcgcgaaaacgggtttgatggcagacatgaccagccgttttacgtgccgtggtaagccgatttataatctggtgggcaccagcacctttacggagtacacggttgtggccgatatcgcggtcgcaaagatcgacccaaaagccccgctggagagctgcctgatcggttgtggttttgcgacgggttatggtgcagcggttaacacggccaaagttacccctggcagcacctgtgcagtgtttggcctgggcggtgttggtttcagcgctattgttggttgtaaagcagctggcgcatcccgtattattggcgttggtactcataaggataagttcccgaaggcaatcgaactgggcgcaactgagtgcctgaatccgaaggactatgacaaaccgatctatgaggttatttgcgagaaaaccaatggcggtgtggattacgcggtcgagtgtgcgggtcgtattgaaactatgatgaacgcattgcagtcgacctattgcggttctggcgttactgttgtgttgggtctggcgagcccgaacgagcgtctgccgctggacccgttgttgctgctgacgggccgttccctgaaaggtagcgtgtttggcggctttaaaggtgaagaagttagccgtctggtggatgactacatgaagaagaaga
tcaatgttaatttcctggtgagcaccaaactgacgctggatcagatcaacaaagcgttcgaattgctgagcagcggtcaaggcgttcgtagcattatgatctactaatga  9atgaaaggtttcgctatgttgggtattaataagctgggttggattgagaaagagcgtccggtcgcaggcagctatgatgcaatcgttcgtccgttggccgttagcccgtgcacgagcgacattcatacggtgttcgagggtgcactgggtgaccgtaagaacatgatcctgggtcatgaggccgttggtgaagttgtcgaagtcggtagcgaagtcaaagattttaaaccgggcgaccgtgtcatcgttccatgcacgacgccagattggcgtagcctggaggtgcaggcaggtttccagcagcatagcaatggcatgctggctggctggaaattctctaatttcaaggatggtgtgttcggtgaatatttccacgtgaacgacgctgacatgaacctggctatcctgccgaaggatatgccgctggagaacgcggtgatgatcacggatatgatgactacgggttttcatggtgcggagctggcggacatccaaatgggtagcagcgtggtcgtcatcggcatcggcgctgtgggtctgatgggcattgcaggcgcaaaactgcgcggtgcgggtcgtatcatcggtgtgggtagccgccctatctgcgtggaggcggcgaagttttacggtgcgactgacattctgaactataagaacggtcacattgttgatcaagtgatgaagctgaccaacggtaaaggcgtggatcgcgttatcatggcgggtggtggttcggaaacgctgagccaggcagttagcatggtcaagccgggtggcattatcagcaatattaattaccacggtagcggtgatgcgctgctgatcccacgtgtcgagtggggttgtggtatggcacacaagaccattaaaggcggtctgtgcccgggtggtcgtttgcgtgcggaaatgctgcgtgatatggttgtctataaccgtgttgacctgagcaagctggtgacgcacgtctatcacggctttgaccatatcgaagaggcgttgctgctgatgaaggataaaccgaaggacctgattaaagcggtcgtgatcctgtaatga 
10atgaagggcctggcgatgctgggtatcggtcgtattggttggattgaaaagaaaatcccggagtgcggcccactggatgcgttggtccgtccgctggcgctggccccgtgcaccagcgacacccacaccgtgtgggctggcgcaatcggcgaccgtcacgacatgattctgggtcacgaagcggtcggtcagatcgtgaaggtgggttccctggtgaagcgtctgaaggttggcgataaggtgatcgtcccggcgattactccggactggggtgaagaagaaagccaacgtggttacccgatgcatagcggtggtatgctgggcggctggaagttctccaatttcaaggacggtgtcttttccgaggtgttccacgtgaacgaggcggatgctaacctggcactgctgccgcgtgatattaaacctgaagatgcggtcatgctgagcgacatggtgaccaccggctttcacggtgccgaattggcgaatattaaactgggtgataccgtgtgcgttattggtatcggcccagtgggtctgatgagcgtggctggtgcgaatcacctgggtgccggtcgcatcttcgcggttggtagccgcaaacactgttgtgatatcgctctggaatacggcgcgactgatattatcaattacaagaatggcgacattgtggagcaaattttgaaggcgaccgatggtaaaggcgttgacaaggttgttattgcaggtggcgatgttcatacgtttgcacaagcggtcaagatgattaaaccgggtagcgatattggtaacgtgaattatctgggtgaaggcgataacattgacattccgcgtagcgaatggggtgtgggcatgggtcataaacacatccacggtggtttgactcctggcggtcgtgtccgcatggaaaagttggcttcgctgattagcaccggcaaactggacaccagcaaactgattactcatcgtttcgagggcctggagaaggtggaagatgccttgatgctgatgaagaacaagccggcagatctgattaagccggttgtccgtattcactatgacgatgaagatacgttgcactaatga 
11atgaaagcactggtttaccgtggccctggccaaaagctggtggaagaacgtcaaaagccggagctgaaagagccaggcgacgcgattgtgaaagtcaccaaaacgaccatctgtggtacggacttgcacattctgaagggcgatgtggcgacgtgtaagccgggtcgcgtgctgggtcacgaaggtgtgggtgttattgaaagcgttggcagcggcgttaccgcgttccaaccgggtgatcgcgtcctgatctcttgtatttctagctgtggcaagtgcagatttgtcgccgtggcatgtttagccactgtaccactggcggctggattctgggtaatgagattgacggtacgcaggcagagtacgttcgtgtcccgcatgccgacacctctctgtatcgtattccagcgggtgcggacgaagaggcgctggtgatgctgagcgatatcctgccgaccggtttcgagtgtggtgtcctgaatggtaaggttgcgcctggcagcagcgttgcgatcgttggcgcaggccctgtcggtttggccgcattgctgacggcgcagttctactctccggcagagattatcatgattgatctggacgacaaccgcctgggcctggcgaagcaattcggcgcaacgcgtaccgttaatagcaccggtggtaacgcagcagcagaggtcaaggctctgacggagggcctgggtgttgacacggctattgaggctgttggcatcccggccaccttcgagctgtgccagaacattgtggctccgggtggcactattgcgaatgtcggcgttcacggttcgaaagtggatctgcatctggaatctctgtggagccataatgtgactatcacgacgcgtctggtggacacggcaacgacgccgatgctgctgaaaaccgtgcaatctcataaactggacccgagccgtctgatcacccatcgttttagcctggaccaaatcctggatgcgtacgaaacgtttggtcaggccgcaagcacccaggcgctgaaggttattatcagcatggaggcgtaatga 
12atgagcaccgcaggtaaagtgattaaatgcaaagcagcagttctgtgggaaccgcataaaccgtttaccattgaagatattgaagttgcacctccgaaagcacatgaagtgcgcattaaaatggttgcaaccggtgtttgtcgttctgatgatcatgcagttagcggtagcctgtttacaccgctgcctgcagttctgggtcatgaaggtgcaggtattgttgaaagcattggtgaaggtgttacctgtgttaaaccgggtgataaagtgattccgctgttttctccgcagtgtggtaaatgtcgcatttgcaaacatccggaaagcaatctgtgttgccagaccaaaaatctgacccagccgaaaggtgcactgctggatggcaccagccgttttagctgtcgtggtaaaccgattcatcattttattagcaccagcacctttagccagtataccgtggttgatgatattgccgtggcaaaaattgatgcagcagcaccgctggataaagtttgtctgattggttgtggttttagcaccggttatggtagcgcagttcaggttgcaaaagttacaccgggtagcacctgtgcagtttttggtctgggtggtgttggtctgagcgttgttattggttgtaaaaccgcaggcgcagcaaaaattattgccgtggatattaataaagataaatttgccaaagccaaagaactgggtgcaaccgattgtattaatccgcaggattataccaaaccgattcaggaagttctgcaggaaatgaccgatggtggtgtggattttagctttgaagtgattggtcgtctggataccatgaccagcgcactgctgagctgtcatagcgcatgtggtgttagcgttattgttggtgttcctccgagcgcacagagcctgagcgttaatccgatgagcctgctgctgggtcgtacctggaaaggtgcaatttttggtggctttaaaagcaaagatgccgttccgaaactggttgcagattttatggccaaaaaatttccgctggaaccgctgattacccatgttctgccgtttgaaaaaattaatgaagcctttgatctgctgcgtgcaggtaaaagcattcgtaccgtgctgaccttttaataa 
13atgcgtgcagttgtgtttgaaaacaaagaacgcgtggccgttaaagaagttaacgcaccgcgtctgcagcatccgctggatgcactggttcgtgttcatctggcaggtatttgtggtagcgatctgcatctgtatcatggtaaaattccggttctgcctggtagcgttctgggtcatgaatttgttggtcaggttgaagcagttggtgaaggtattcaggatctgcagcctggtgattgggttgttggtccgtttcatattgcatgtggcacctgtccgtattgtcgtcgtcatcagtataatctgtgtgaacgtggtggtgtttatggttatggtccgatgtttggtaatctgcagggtgcacaggcagaaattctgcgtgttccgtttagcaatgtgaatctgcgtaaactgcctccgaatctgtctccggaacgtgcaatttttgccggtgatattctgagcaccgcctatggtggtctgattcagggtcagctgcgtcctggtgatagcgttgcagttattggtgcaggtccggttggtctgatggcaattgaagttgcacaggttctgggtgcaagcaaaattctggccattgatcgtattccggaacgtctggaacgtgcagcaagcctgggtgcaattccgattaatgccgaacaggaaaatccggttcgtcgcgttcgtagcgaaaccaatgatgaaggtccggatctggttctggaagccgttggtggtgcagcaaccctgagcctggcactggaaatggttcgtcctggtggtcgtgttagcgcagttggtgttgataatgcaccgagctttccgtttccgctggcaagcggtctggttaaagatctgacgtttcgtattggtctggcaaatgtgcatctgtatattgatgcagttctggcactgctggccagcggtcgtctgcagccggaacgtattgttagccattatctgccgctggaagaagcacctcgcggttacgaactgtttgatcgcaaagaagcactgaaagttctgctggttgtgcgtggttaataa 
14atgaaagcactggtttatggtggtccgggtcagaaaagcctggaagatcgtccgaaaccggaactgcaggcaccgggtgatgcaattgttcgtattgtgaaaaccaccatttgtggcaccgatctgcatattctgaaaggtgatgttgcaacctgtgcaccgggtcgtattctgggtcatgaaggtgttggtattgttgatagcgttggtgcagcagttaccgcatttcgtccgggtgatcatgttctgattagctgtattagcgcctgtggtaaatgtgattattgccgtcgtggtatgtatagccattgtacaaccggtggatggattctgggtaatgaaattgatggcacccaggcagaatatgttcgtacaccgcatgcagataccagcctgtatccggttccggcaggcgcagatgaagaggcactggttatgctgagcgatattctgccgaccggttttgaatgtggtgtgctgaatggtaaagttgcaccgggtggcaccgttgcaattgttggtgcaggtccgattggtctggcagcactgctgaccgcacagttttattctccggcagaaattattatgattgatctggatgataatcgtctgggtattgcacgtcagtttggtgcaacccagaccattaatagcggtgatggtcgtgcagcagaaaccgttaaagcactgaccggtggtcgtggtgttgataccgcaattgaagcagttggtgttccggcaacctttgaactgtgtcaggatctggttggtcctggtggtgttattgcaaatattggtgtgcatggtcgtaaagttgatctgcatctggatcgtctgtggagccagaatattgcaattaccacccgtctggttgataccgttagcaccccgatgctgctgaaaaccgttcagagccgtaaactggacccgagccagctgattacccatcgttttcgcctggatgaaattctggcagcctatgatacctttgcacgtgcagcagatacccaggcactgaaagttattattgcagcctaataa 
15atgaaagcactggtttatcatggtccgggtcagaaagcactggaagaacgtccgaaaccgcagattgaagcaagcggtgatgccattgttaaaattgtgaaaaccaccatttgtggcaccgatctgcatattctgaaaggtgatgttgcaacctgtgcaccgggtcgtattctgggtcatgaaggtgtgggtattattgatagcgttggtgccggtgttaccgcatttcagcctggtgatcgtgttctgattagctgtattagcagctgtggcaaatgtgattattgtcgtcgtggtctgtatagccattgtacaaccggtggttggattctgggtaatgaaattgatggcacccaggcagaatatgttcgtacaccgcatgcagataccagcctgtatcgtattccggcaggcgcagatgaagaggcactggttatgctgagcgatattctgccgaccggttttgaatgtggtgtgctgaatggtaaagttgaaccgggtagcaccgttgcaattgttggtgcaggtccgattggtctggcagcactgctgaccgcacagttttatgcaccgggtgatattattatgattgatctggatgataatcgtctggatgttgcacgtcgttttggtgcaacccataccattaatagcggtgatggtaaagcagcagaagcagttaaagcactgaccggtggtattggtgttgataccgcaattgaagccgttggtattccggcaacctttctgctgtgtgaagatattgttgcaccgggtggtgttattgcaaatgttggtgtgcatggtgttaaagttgatctgcatctggaacgtctgtgggcacataatattaccattaccacccgtctggttgataccgttaccaccccgatgctgctgaaaaccgttcagagcaaaaaactggacccgctgcagctgattacccatcgttttaccctggatcatattctggatgcctatgatacctttagccgtgcagcagataccaaagccctgaaagttattgtgagcgcctaataa 
16atggaaaatattatgaaagcaatggtgtattatggcgatcatgatattcgttttgaagaacgcaaaaaaccggaactgattgatccgaccgatgccattattaaaatgaccaaaaccaccatttgtggcaccgatctgggtatttataaaggcaaaaatccggaaattgaacagaaagaacaggaaaaaaacggcagctttaatggtcgtattctgggtcatgaaggtattggtattgtggagcagattggtagcagcgtgaaaaacattaaagtgggcgataaagttattgttagctgcgttagccgttgtggcacctgtgaaaattgtgccaaacagctgtatagccattgtcgtaatgatggtggttggattatgggctatatgattgatggcacccaggcagaatatgttcgtaccccgtttgcagataccagcctgtatgttctgccggaaggtctgaatgaagatgttgcagttctgctgtctgatgcactgccgaccgcacatgaaattggtgttcagaatggcgatattaaaccgggtgataccgttgcaattgttggtgcaggtccggttggtatgagcgcactgctgaccgctcagttttatagcccgagccagattattatgattgatatggatgaaaatcgtctggcaatggcaaaagaactgggtgcaaccgataccattaatagcggcaccgaagatgcaattgcacgtgttatggaactgaccaatcagcgtggtgttgattgtgcaattgaagccgttggtattgaaccgacctgggatatttgtcagaatattgtgaaagaaggtggtcatctggcaaatgttggtgttcatggcaaaagcgtgaattttagcctggaaaaactgtggattaaaaatctgaccattaccaccggtctggttaatgcaaataccaccggtatgctgctgaaaagctgttgtagcggtaaactgccgatggaaaaactggcaacccatcattttaaatttaatgaaattgaaaaggcctatgatgtgtttattaatgcagccaaagaaaaagccatgaaagtgattattgatttttaataa 
17atgaaagcactgacctatctgggtccgggtaaaaaagaagtgatggaaaaaccgaaaccgaaaattgaaaaagaaaccgatgccattgtgaaaattaccaaaaccaccatttgtggcaccgatctgcatattctgagcggtgatgttccgaccgttgaagaaggtcgtattctgggtcatgaaggtgtgggtattattgaagaagttggctctggcgttaaaaattttaaaaaaggcgatcgcgttctgattagctgtattaccagctgtggcaaatgcgaaaattgcaaaaaaggcctgtatgcccattgtgaagatggtggttggattctgggccatctgattgatggcacccaggcagaatatgttcgtattccgcatgcagataatagcctgtatccgattccggaaggtgttgatgaagaggcactggttatgctgagcgatattctgccgaccggttttgaaattggtgtgctgaatggtaaagttcagcctggtcagaccgttgcaattattggtgcaggtccggttggtatggcagcactgctgaccgcacagttttattctccggcagaaattattatggtggatctggatgataatcgtctggaagtggccaaaaaatttggtgcaacccaggttgttaatagcgcagatggtaaagccgtggaaaaaattatggaactgaccggtggcaaaggtgtggatgttgcaatggaagcagttggtattccggtgacctttgatatttgccaggaaattgttaaacctggcggttatattgcaaatattggcgtgcatggtaaaagcgtggaatttcatattgaaaaactgtggattcgcaacattaccctgaccaccggtctggttaataccacctctaccccgatgctgctgaaaaccgttcagagcaaaaaactgaaaccggaacagctgattacccatcgttttgcctttgccgatattatgaaagcctatgaagtgtttggtaatgcagccaaagaaaaagccctgaaagtgattattagcaatgattaataa 
18atgagctatccggaaaaatttcagggtattggcattaccaatcgcgaagattggaaacatccgaaaaaagtgacctttgaaccgaaacagtttaatgataaagatgtggatattaaaattgaagcctgcggtgtttgtggttctgatgttcattgtgcagcaagccattggggtccggttgcagaaaaacaggttgtgggccatgaaattattggtcgtgtgctgaaagttggtccgaaatgtaccaccggtattaaagttggtgatcgtgttggtgttggtgcacaggcatggtcttgtctggaatgtagccgttgcaaaagcgataatgaaagctattgtccgaaaagcgtttggacctatagcattccgtatattgatggttatgttagccagggtggttatgcaagccatattcgcctgcatgaacattttgcaattccgattccggataaactgagcaatgaactggcagcaccgctgctgtgtggtggtattaccgtttattctccgctgctgcgtaatggttgtggtccgggtaaaaaagttggtattgtgggcattggtggtattggtcacatgggtctgctgtttgcaaaaggtatgggtgccgaagtttatgcatttagccgcacccatagcaaagaggcagacgccaaaaaactgggtgccgatcattttattgcaaccctggaagataaagattggaccaccaaatattttgataccctggatctgctggttatttgtgcaagcagcctgaccgatattaattttgatgaactgaccaaaattatgaaagtgaataccaaaattattagcattagcgcaccggcagcagatgaagttctgaccctgaaaccgtttggtctgattggtgtgaccattggtaatagcgcaattggtagccgtcgtgaaattgaacatctgctgaattttgtggccgaaaaagatattaaaccgtgggttgaaaccctgccggttggtgaagccggtgttaatgaagcatttgaacgcatggataaaggtgatgtgaaatatcgttttaccctggtggattttgataaagaatttggcaattaataa 
19atgagcgaagaaacctttaccgcatgggcatgtaaaagcaaaagcgcaccgctggaaccgatggaaatgaccttttgccattgggatgatgatatggttcagatggatgttatttgttgtggtgtttgtggcaccgatctgcataccgttgatgaaggttggggtccgaccgaatttccgtgtgttgtgggccatgaaattattggcaatgtgaccaaagtgggtaaaaatgtgacccgtattaaagttggtgatcgttgtggtgttggttgtcagagcgcaagctgtggtaaatgcgatttttgcaaaaaaggcatggaaaatctgtgtagcacccatgcagtttggacctttaatgatcgctatgataatgccaccaaagataaaacctatggtggctttgcaaaaaaatggcgtggcaatcaggattttgttgttcatgtgccgatggatttttctccggaagttgcagcaagctttctgtgtggtggtgttaccacctatgcaccgctgaaacgttatggtgttggtaaaggtagcaaagttgcagttctgggtctgggtggtctgggccattttggtgttcagtgggcaaaagcaatgggtgcagaagttgttgcctttgacgtgattccggataaagtggatgatgccaaaaaactgggctgtgatgattatgttctgatgcagaaagaagagcagatggaaccgcattataatacctttacccatattctggccaccaaaattgtgaataaatgctgggatcagtattttaaaatgctgaaaaataatggcatttttatgctgtgcgatattccggaagttccgctgagcggtatgagcgcatttgttatggcaggtaaacagctgaccattgcaggcacctttattggtagcccgagcgttattcaggaatgtctggattttgcagccaagcataatgttcgtacctgggttaatacctttccgatggaaaaaattaatgaagcctttgaatttgttcgtcaggcaaaaccgcgttatcgtgccgttgtgatgaattaataa 
20atgtttaccgttaatgcacgtagcaccagcgcaccgggtgcaccgtttgaagcagttgttattgaacgtcgtgatccgggtccgggtgatgttgttattgatattgcctttagcggtatttgtcataccgatgttagccgtgcacgtagcgaatttggcaccacccattatccgctggttccgggtcatgaaattgccggtgttgttagcaaagttggttccgatgttaccaaatttgcagttggtgatcgtgttggtgttggttgtattgttgatagctgccgtgaatgtgattattgtcgtgcaggtctggaaccgtattgtcgtaaagatcatgtgcgcacctataatagcatgggtcgtgatggtcgtattaccctgggtggttatagcgaaaaaattgtggtggatgaaggttatgttctgcgtattccggatgcaattccgctggatcaggcagcaccgctgctgtgtgcaggtattaccatgtattctccgctgcgtcattggaaagcaggtccgggtagccgtattgcaattgttggttttggtggtctgggtcatgttggtgttgcaattgcacgtgcactgggtgcacataccaccgtttttgatctgacgatggataaacatgatgatgcaattcgtctgggtgcagatgattatcgtctgagcaccgatgcaggcatttttaaagaatttgaaggtgcctttgaactgattgttagcaccgttccggcaaatctggattatgacctgtttctgaaaatgctggcactggatggcacctttgttcagctgggtgttccgcataatccggttagcctggatgtttttagcctgttttataatcgtcgtagcctggcaggcaccctggttggtggtattggtgaaacccaggaaatgctggatttttgcgcagaacatagcattgttgccgaaattgaaaccgttggtgccgatgaaattgatagcgcctatgatcgtgttgcagccggtgatgttcgttatcgtatggttctggatgttggcaccctggcaacccagcgttaataa

TABLE 5 SEQ ID NO POLYPEPTIDE SEQUENCE 21MSTAGKVIKCKAAVLWEEKKPFSIEEVEVAPPKAHEVRIKMVATGICRSDDHVVSGTLVTPLPVIAGHEAAGIVESIGEGVTTVRPGDKVIPLFTPQCGKCRVCKHPEGNFCLKNDLSMPRGTMQDGTSRFTCRGKPIHHFLGTSTFSQYTVVDEISVAKIDAASPLEKVCLIGCGFSTGYGSAVKVAKVTQGSTCAVFGLGGVGLSVIMGCKAAGAARIIGVDINKDKFAKAKEVGATECVNPQDYKKPIQEVLTEMSNGGVDFSFEVIGRLDTMVTALSCCQEAYGVSVIVGVPPDSQNLSMNPMLLLSGRTWKGAIFGGFKSKDSVPKLVADFMAKKFALDPLITHVLPFEKINEGFDL LRSGESIRTILTF 22MSYPEKFEGIAIQSHEDWKNPKKTKYDPKPFYDHDIDIKIEACGVCGSDIHCAAGHWGNMKMPLVVGHEIVGKVVKLGPKSNSGLKVGQRVGVGAQVFSCLECDRCKNDNEPYCTKFVTTYSQPYEDGYVSQGGYANYVRVHEHFVVPIPENIPSHLAAPLLCGGLTVYSPLVRNGCGPGKKVGIVGLGGIGSMGTLISKAMGAETYVISRSSRKREDAMKMGADHYIATLEEGDWGEKYFDTFDLIVVCASSLTDIDFNIMPKAMKVGGRIVSISIPEQHEMLSLKPYGLKAVSISYSALGSIKELNQLLKLVSEKDIKIWVETLPVGEAGVHEAFERMEKGDVRYRFTLVGYDKEFSD 23MLYPEKFQGIGISNAKDWKHPKLVSFDPKPFGDHDVDVEIEACGICGSDFHIAVGNWGPVPENQILGHEIIGRVVKVGSKCHTGVKIGDRVGVGAQALACFECERCKSDNEQYCTNDHVLTMWTPYKDGYISQGGFASHVRLHEHFAIQIPENIPSPLAAPLLCGGITVFSPLLRNGCGPGKRVGIVGIGGIGHMGILLAKAMGAEVYAFSRGHSKREDSMKLGADHYIAMLEDKGWTEQYSNALDLLVVCSSSLSKVNFDSIVKIMKIGGSIVSIAAPEVNEKLVLKPLGLMGVSISSSAIGSRKEIEQLLKLVSEKNVKIWVEKLPISEEGVSHAFTRMESGDVKYRFTLVDYDKKFHK 24MLSFDYSIPTKVFFGKGKIDVIGEEIKKYGSRVLIVYGGGSIKRNGIYDRATAILKENNIAFYELSGVEPNPRITTVKKGIEICRENNVDLVLAIGGGSAIDCSKVIAAGVYYDGDTWDMVKDPSKITKVLPIASILTLSATGSEMDQIAVISNMETNEKLGVGHDDMRPKFSVLDPTYTFTVPKNQTAAGTADIMSHTFESYFSGVEGAYVQDGIREAILRTCIKYGKIAMEKTDDYEARANLMWASSLAINGLLSLGKDRKWSCHPMEHELSAYYDITHGVGLAILTPNWMEYILNDDTLHKFVSYGINVWGIDKNKDNYEIAREAIKNTREYFNSLGIPSKLREVGIGKDKLELMAKQAVRN SGGTIGSLRPINAEDVLEIFKKSY25 MVDFEYSIPTRIFFGKDKINVLGRELKKYGSKVLIVYGGGSIKRNGIYDKAVSILEKNSIKFYELAGVEPNPRVTTVEKGVKICRENGVEVVLAIGGGSAIDCAKVIAAACEYDGNPWDIVLDGSKIKRVLPIASILTIAATGSEMDTWAVINNMDTNEKLIAAHPDMAPKFSILDPTYTYTVPTNQTAAGTADIMSHIFEVYFSNTKTAYLQDRMAEALLRTCIKYGGIALEKPDDYEARANLMWASSLAINGLLTYGKDTNWSVHLMEHELSAYYDITHGVGLAILTPNWMEYILNNDTVYKFVEYGVNVWGIDKEKNHYDIAHQAIQKTRDYFVNVLGLPSRLRDVGIEEEKLDIMAKESVKLTGGTIGNLRPVNASEVLQIFKKSV 
26MKALVYHGDHKISLEDKPKPTLQKPTDVVVRVLKTTICGTDLGIYKGKNPEVADGRILGHEGVGVIEEVGESVTQFKKGDKVLISCVTSCGSCDYCKKQLYSHCRDGGWILGYMIDGVQAEYVRIPHADNSLYKIPQTIDDEIAVLLSDILPTGHEIGVQYGNVQPGDAVAIVGAGPVGMSVLLTAQFYSPSTIIVIDMDENRLQLAKELGATHTINSGTENVVEAVHRIAAEGVDVAIEAVGIPATWDICQEIVKPGAHIANVGVHGVKVDFEIQKLWIKNLTITTGLVNTNTTPMLMKVASTDKLPLKKMITHRFELAEIEHAYQVFLNGAKEKAMKIILSNAGAA 27MAASCILLHTGQKMPLIGLGTWKSDPGQVKAAIKYALSVGYRHIDCAAIYGNETEIGEALKENVGPGKLVPREELFVTSKLWNTKHHPEDVEPALRKTLADLQLEYLDLYLMHWPYAFERGDSPFPKNADGTIRYDSTHYKETWRALEALVAKGLVRALGLSNFNSRQIDDVLSVASVRPAVLQVECHPYLAQNELIAHCQARNLEVTAYSPLGSSDRAWRDPEEPVLLKEPVVLALAEKHGRSPAQILLRWQVQRKVSCIPKSVTPSRILENIQVFDFTFSPEEMKQLDALNKNLRFIV PMLTVDGKRVPRDAGHPLYPFNDPY 28MCTAGKDITCKAAVAWEPHKPLSLETITVAPPKAHEVRIKILASGICGSDSSVLKEIIPSKFPVILGHEAVGVVESIGAGVTCVKPGDKVIPLFVPQCGSCRACKSSNSNFCEKNDMGAKTGLMADMTSRFTCRGKPIYNLVGTSTFTEYTVVADIAVAKIDPKAPLESCLIGCGFATGYGAAVNTAKVTPGSTCAVFGLGGVGFSAIVGCKAAGASRIIGVGTHKDKFPKAIELGATECLNPKDYDKPIYEVICEKTNGGVDYAVECAGRIETMMNALQSTYCGSGVTVVLGLASPNERLPLDPLLLLTGRSLKGSVFGGFKGEEVSRLVDDYMKKKINVNFLVSTKLTLDQINKAFELLSSG QGVRSIMIY 29MKGFAMLGINKLGWIEKERPVAGSYDAIVRPLAVSPCTSDIHTVFEGALGDRKNMILGHEAVGEVVEVGSEVKDFKPGDRVIVPCTTPDWRSLEVQAGFQQHSNGMLAGWKFSNFKDGVFGEYFHVNDADMNLAILPKDMPLENAVMITDMMTTGFHGAELADIQMGSSVVVIGIGAVGLMGIAGAKLRGAGRIIGVGSRPICVEAAKFYGATDILNYKNGHIVDQVMKLTNGKGVDRVIMAGGGSETLSQAVSMVKPGGIISNINYHGSGDALLIPRVEWGCGMAHKTIKGGLCPGGRLRAEMLRDMVVYNRVDLSKLVTHVYHGFDHIEEALLLMKDKPKDLIKAVVIL 30MKGLAMLGIGRIGWIEKKIPECGPLDALVRPLALAPCTSDTHTVWAGAIGDRHDMILGHEAVGQIVKVGSLVKRLKVGDKVIVPAITPDWGEEESQRGYPMHSGGMLGGWKFSNFKDGVFSEVFHVNEADANLALLPRDIKPEDAVMLSDMVTTGFHGAELANIKLGDTVCVIGIGPVGLMSVAGANHLGAGRIFAVGSRKHCCDIALEYGATDIINYKNGDIVEQILKATDGKGVDKVVIAGGDVHTFAQAVKMIKPGSDIGNVNYLGEGDNIDIPRSEWGVGMGHKHIHGGLTPGGRVRMEKLASLISTGKLDTSKLITHRFEGLEKVEDALMLMKNKPADLIKPVVRIHYDDED TLH 
31MKALVYRGPGQKLVEERQKPELKEPGDAIVKVTKTTICGTDLHILKGDVATCKPGRVLGHEGVGVIESVGSGVTAFQPGDRVLISCISSCGKCSFCRRGMFSHCTTGGWILGNEIDGTQAEYVRVPHADTSLYRIPAGADEEALVMLSDILPTGFECGVLNGKVAPGSSVAIVGAGPVGLAALLTAQFYSPAEIIMIDLDDNRLGLAKQFGATRTVNSTGGNAAAEVKALTEGLGVDTAIEAVGIPATFELCQNIVAPGGTIANVGVHGSKVDLHLESLWSHNVTITTRLVDTATTPMLLKTVQSHKLDPSRLITHRFSLDQILDAYETFGQAASTQALKVIISMEA 32MSTAGKVIKCKAAVLWEPHKPFTIEDIEVAPPKAHEVRIKMVATGVCRSDDHAVSGSLFTPLPAVLGHEGAGIVESIGEGVTCVKPGDKVIPLFSPQCGKCRICKHPESNLCCQTKNLTQPKGALLDGTSRFSCRGKPIHHFISTSTFSQYTVVDDIAVAKIDAAAPLDKVCLIGCGFSTGYGSAVQVAKVTPGSTCAVFGLGGVGLSVVIGCKTAGAAKIIAVDINKDKFAKAKELGATDCINPQDYTKPIQEVLQEMTDGGVDFSFEVIGRLDTMTSALLSCHSACGVSVIVGVPPSAQSLSVNPMSLLLGRTWKGAIFGGFKSKDAVPKLVADFMAKKFPLEPLITHVLPFEKINEAFDLLR AGKSIRTVLTF 33MRAVVFENKERVAVKEVNAPRLQHPLDALVRVHLAGICGSDLHLYHGKIPVLPGSVLGHEFVGQVEAVGEGIQDLQPGDWVVGPFHIACGTCPYCRRHQYNLCERGGVYGYGPMFGNLQGAQAEILRVPFSNVNLRKLPPNLSPERAIFAGDILSTAYGGLIQGQLRPGDSVAVIGAGPVGLMAIEVAQVLGASKILAIDRIPERLERAASLGAIPINAEQENPVRRVRSETNDEGPDLVLEAVGGAATLSLALEMVRPGGRVSAVGVDNAPSFPFPLASGLVKDLTFRIGLANVHLYIDAVLALLASGRLQPERIVSHYLPLEEAPRGYELFDRKEALKVLLVVRG 34MKALVYGGPGQKSLEDRPKPELQAPGDAIVRIVKTTICGTDLHILKGDVATCAPGRILGHEGVGIVDSVGAAVTAFRPGDHVLISCISACGKCDYCRRGMYSHCTTGGWILGNEIDGTQAEYVRTPHADTSLYPVPAGADEEALVMLSDILPTGFECGVLNGKVAPGGTVAIVGAGPIGLAALLTAQFYSPAEIIMIDLDDNRLGIARQFGATQTINSGDGRAAETVKALTGGRGVDTAIEAVGVPATFELCQDLVGPGGVIANIGVHGRKVDLHLDRLWSQNIAITTRLVDTVSTPMLLKTVQSRKLDPSQLITHRFRLDEILAAYDTFARAADTQALKVIIAA 35MKALVYHGPGQKALEERPKPQIEASGDAIVKIVKTTICGTDLHILKGDVATCAPGRILGHEGVGIIDSVGAGVTAFQPGDRVLISCISSCGKCDYCRRGLYSHCTTGGWILGNEIDGTQAEYVRTPHADTSLYRIPAGADEEALVMLSDILPTGFECGVLNGKVEPGSTVAIVGAGPIGLAALLTAQFYAPGDIIMIDLDDNRLDVARRFGATHTINSGDGKAAEAVKALTGGIGVDTAIEAVGIPATFLLCEDIVAPGGVIANVGVHGVKVDLHLERLWAHNITITTRLVDTVTTPMLLKTVQSKKLDPLQLITHRFTL DHILDAYDTFSRAADTKALKVIVSA36 
MENIMKAMVYYGDHDIRFEERKKPELIDPTDAIIKMTKTTICGTDLGIYKGKNPEIEQKEQEKNGSFNGRILGHEGIGIVEQIGSSVKNIKVGDKVIVSCVSRCGTCENCAKQLYSHCRNDGGWIMGYMIDGTQAEYVRTPFADTSLYVLPEGLNEDVAVLLSDALPTAHEIGVQNGDIKPGDTVAIVGAGPVGMSALLTAQFYSPSQIIMIDMDENRLAMAKELGATDTINSGTEDAIARVMELTNQRGVDCAIEAVGIEPTWDICQNIVKEGGHLANVGVHGKSVNFSLEKLWIKNLTITTGLVNANTTGMLLKSCCSGKLPMEKLATHHFKFNEIEKAYDVFINAAKEKAMKVIIDF 37MKALTYLGPGKKEVMEKPKPKIEKETDAIVKITKTTICGTDLHILSGDVPTVEEGRILGHEGVGIIEEVGSGVKNFKKGDRVLISCITSCGKCENCKKGLYAHCEDGGWILGHLIDGTQAEYVRIPHADNSLYPIPEGVDEEALVMLSDILPTGFEIGVLNGKVQPGQTVAIIGAGPVGMAALLTAQFYSPAEIIMVDLDDNRLEVAKKFGATQVVNSADGKAVEKIMELTGGKGVDVAMEAVGIPVTFDICQEIVKPGGYIANIGVHGKSVEFHIEKLWIRNITLTTGLVNTTSTPMLLKTVQSKKLKPEQLITHRFAFADIMKAYEVFGNAAKEKALKVIISND 38MSYPEKFQGIGITNREDWKHPKKVTFEPKQFNDKDVDIKIEACGVCGSDVHCAASHWGPVAEKQVVGHEIIGRVLKVGPKCTTGIKVGDRVGVGAQAWSCLECSRCKSDNESYCPKSVWTYSIPYIDGYVSQGGYASHIRLHEHFAIPIPDKLSNELAAPLLCGGITVYSPLLRNGCGPGKKVGIVGIGGIGHMGLLFAKGMGAEVYAFSRTHSKEADAKKLGADHFIATLEDKDWTTKYFDTLDLLVICASSLTDINFDELTKIMKVNTKIISISAPAADEVLTLKPFGLIGVTIGNSAIGSRREIEHLLNFVAEKDIKPWVETLPVGEAGVNEAFERMDKGDVKYRFTLVDFDKEFGN 39MSEETFTAWACKSKSAPLEPMEMTFCHWDDDMVQMDVICCGVCGTDLHTVDEGWGPTEFPCVVGHEIIGNVTKVGKNVTRIKVGDRCGVGCQSASCGKCDFCKKGMENLCSTHAVWTFNDRYDNATKDKTYGGFAKKWRGNQDFVVHVPMDFSPEVAASFLCGGVTTYAPLKRYGVGKGSKVAVLGLGGLGHFGVQWAKAMGAEVVAFDVIPDKVDDAKKLGCDDYVLMQKEEQMEPHYNTFTHILATKIVNKCWDQYFKMLKNNGIFMLCDIPEVPLSGMSAFVMAGKQLTIAGTFIGSPSVIQECLDFAAKHNVRTWVNTFPMEKINEAFEFVRQAKPRYRAVVMN 40MFTVNARSTSAPGAPFEAVVIERRDPGPGDVVIDIAFSGICHTDVSRARSEFGTTHYPLVPGHEIAGVVSKVGSDVTKFAVGDRVGVGCIVDSCRECDYCRAGLEPYCRKDHVRTYNSMGRDGRITLGGYSEKIVVDEGYVLRIPDAIPLDQAAPLLCAGITMYSPLRHWKAGPGSRIAIVGFGGLGHVGVAIARALGAHTTVFDLTMDKHDDAIRLGADDYRLSTDAGIFKEFEGAFELIVSTVPANLDYDLFLKMLALDGTFVQLGVPHNPVSLDVFSLFYNRRSLAGTLVGGIGETQEMLDFCAEHSIVAEIETVGADEIDSAYDRVAAGDVRYRMVLDVGTLATQR

In one embodiment, the method for screening candidate polypeptides having alcohol dehydrogenase activity comprises:

-   (a) measuring the rate of cofactor oxidation by a lower alkyl aldehyde for the candidate polypeptides in the presence or absence of a lower alkyl alcohol; and
-   (b) selecting only those candidate polypeptides that oxidize a cofactor faster relative to a control polypeptide in the presence or absence of a lower alkyl alcohol.

In one embodiment, (b) comprises selecting only those candidate polypeptides that oxidize a cofactor faster relative to a control polypeptide in both the presence and absence of a lower alkyl alcohol. In one embodiment, the cofactor is NADH. In another embodiment, the cofactor is NADPH. In yet another embodiment, the control polypeptide is HLADH having the amino acid sequence of SEQ ID NO: 21. In yet another embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26. In another embodiment, step (a) comprises monitoring a change in A340 nm.
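The comparison in steps (a) and (b) can be sketched numerically. The following is a minimal illustration only, assuming a 1-cm path length and the commonly used molar extinction coefficient of 6220 M^-1 cm^-1 for NADH or NADPH at 340 nm; the function names, candidate labels, and slope values are hypothetical and are not data from this disclosure.

```python
# Sketch: convert A340 slopes to cofactor oxidation rates and apply step (b).
# Assumptions (not stated in the text): 1-cm path length and
# epsilon(340 nm) = 6220 M^-1 cm^-1 for NADH/NADPH.

EPSILON_340 = 6220.0  # M^-1 cm^-1
PATH_CM = 1.0         # cm


def oxidation_rate_uM_per_min(delta_a340_per_min):
    """Return the cofactor oxidation rate in micromolar per minute.

    A340 falls as NAD(P)H is oxidized, so the slope is negative and the
    rate is reported as a positive number.
    """
    return -delta_a340_per_min / (EPSILON_340 * PATH_CM) * 1e6


def select_candidates(slopes, control, require_both=True):
    """Select candidates that oxidize cofactor faster than the control.

    slopes maps a name to (slope_without_alcohol, slope_with_alcohol),
    both in A340 units per minute.
    """
    control_rates = [oxidation_rate_uM_per_min(s) for s in slopes[control]]
    selected = []
    for name, pair in slopes.items():
        if name == control:
            continue
        faster = [
            oxidation_rate_uM_per_min(s) > c
            for s, c in zip(pair, control_rates)
        ]
        if (all(faster) if require_both else any(faster)):
            selected.append(name)
    return selected


# Hypothetical slopes: (absence of alcohol, presence of alcohol).
example = {
    "control": (-0.0622, -0.0311),
    "candidate_A": (-0.1244, -0.0933),
    "candidate_B": (-0.0800, -0.0200),
}
print(select_candidates(example, "control"))
print(select_candidates(example, "control", require_both=False))
```

With `require_both=True` the sketch corresponds to the embodiment in which a candidate must be faster than the control both with and without the alcohol; with `require_both=False` it corresponds to the broader "presence or absence" wording.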

In another embodiment, the method for screening candidate polypeptides having alcohol dehydrogenase activity comprises:

-   (a) measuring one or more of the following values for the candidate polypeptides:
    -   (i) the K_(M) value for a lower alkyl aldehyde;
    -   (ii) the K_(I) value for a lower alkyl alcohol; and
    -   (iii) k_(cat)/K_(M); and
-   (b) selecting only those candidate polypeptides having one or more of the following characteristics:
    -   (i) the K_(M) value for a lower alkyl aldehyde is lower relative to a control polypeptide;
    -   (ii) the K_(I) value for a lower alkyl alcohol is higher relative to a control polypeptide; and
    -   (iii) the k_(cat)/K_(M) value for a lower alkyl aldehyde is higher relative to a control polypeptide.

In yet another embodiment, the control polypeptide is Achromobacter xylosoxidans SadB having the amino acid sequence of SEQ ID NO: 26. In another embodiment, the selected candidate polypeptides have two or more of the above characteristics. In another embodiment, the selected candidate polypeptides have three or more of the above characteristics. In another embodiment, the selected candidate polypeptides preferentially use NADH as a cofactor.
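The selection on kinetic characteristics described above can be expressed as a simple comparison against the control. The sketch below is illustrative only: the class and function names, the units, and the numeric values are hypothetical, and the `minimum` parameter is an assumed way of representing the "one or more", "two or more", and "three or more" embodiments.

```python
from dataclasses import dataclass


@dataclass
class Kinetics:
    km_aldehyde_mM: float  # K_M for the lower alkyl aldehyde
    ki_alcohol_mM: float   # K_I for the lower alkyl alcohol
    kcat_per_s: float      # k_cat for aldehyde reduction

    @property
    def efficiency(self):
        """k_cat/K_M in s^-1 mM^-1."""
        return self.kcat_per_s / self.km_aldehyde_mM


def characteristics_met(candidate, control):
    """Evaluate characteristics (i)-(iii) relative to the control."""
    return {
        "lower_KM": candidate.km_aldehyde_mM < control.km_aldehyde_mM,
        "higher_KI": candidate.ki_alcohol_mM > control.ki_alcohol_mM,
        "higher_kcat_over_KM": candidate.efficiency > control.efficiency,
    }


def is_selected(candidate, control, minimum=1):
    """True if at least `minimum` of the three characteristics are met."""
    return sum(characteristics_met(candidate, control).values()) >= minimum


# Hypothetical values for illustration only.
control = Kinetics(km_aldehyde_mM=1.0, ki_alcohol_mM=100.0, kcat_per_s=50.0)
candidate = Kinetics(km_aldehyde_mM=0.5, ki_alcohol_mM=80.0, kcat_per_s=40.0)
print(characteristics_met(candidate, control))
print(is_selected(candidate, control, minimum=2))
```

In this hypothetical case the candidate has a lower K_M and a higher k_cat/K_M than the control but a lower K_I, so it meets two of the three characteristics.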

In one embodiment of the invention, polynucleotide sequences suitable for use in the screening methods of the invention comprise nucleotide sequences that are at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, or SEQ ID NO: 20. In another embodiment of the invention, a polynucleotide sequence suitable for use in the screening methods of the invention can be selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, and SEQ ID NO: 20, or an active variant, fragment or derivative thereof. In one embodiment, polynucleotides have been codon-optimized for expression in a specific host cell.

In one embodiment of the invention, candidate polypeptides suitable for use in the screening methods of the invention have amino acid sequences that are at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, or SEQ ID NO: 40. In another embodiment of the invention, a candidate polypeptide suitable for use in the screening methods of the invention has an amino acid sequence selected from the group consisting of: SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, and SEQ ID NO: 40, or an active variant, fragment or derivative thereof. In one embodiment, the polynucleotides encoding candidate polypeptides suitable for use in the screening methods of the invention have been codon-optimized for expression in a specific host cell.
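Identity thresholds such as "at least about 80%" presuppose an alignment and a counting convention, neither of which is fixed by the passage above. The following minimal sketch assumes the two sequences have already been aligned (gaps written as "-") and counts identity over the aligned columns in which neither sequence has a gap. The function name and the short example strings are hypothetical; other conventions, such as dividing by the length of the shorter sequence, give different values.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped aligned positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
value = percent_identity("MKALVYHGDH", "MKALTYLG-H")
print(round(value, 1))
print(value >= 80.0)
```

Here seven of the nine compared columns match, so the fragment pair falls just below an 80% threshold under this convention.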

In one embodiment of the invention, the polynucleotide sequence suitable for use in the screening methods of the invention has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 2 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, candidate polypeptides for use in the screening methods comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 22. In another embodiment, the candidate polypeptide comprises the amino acid sequence of SEQ ID NO: 22 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence suitable for use in the screening methods has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 3. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 3 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, candidate polypeptides for use in the screening methods comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 23. In another embodiment, the candidate polypeptide comprises the amino acid sequence of SEQ ID NO: 23 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence for use in the screening methods has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 11. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 11 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, candidate polypeptides for use in the screening methods comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 31. In another embodiment, the candidate polypeptide comprises the amino acid sequence of SEQ ID NO: 31 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, the polynucleotide sequence for use in the screening methods has a nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 9. In another embodiment, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 9 or an active variant, fragment or derivative thereof.

In one embodiment of the invention, candidate polypeptides for use in the screening methods comprise an amino acid sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 29. In another embodiment, the candidate polypeptide comprises the amino acid sequence of SEQ ID NO: 29 or an active variant, fragment or derivative thereof.

In another embodiment, the method for screening candidate polypeptides results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a temperature up to about 70° C. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a temperature of about 10° C., 15° C., 20° C., 25° C., 30° C., 35° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., or 70° C. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a temperature of about 30° C.

In another embodiment, the method for screening candidate polypeptides results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH from about 4 to about 9. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH from about 5 to about 8. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH from about 6 to about 7. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH from about 6.5 to about 7. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH of about 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, or 9. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol at a pH of about 7.

In another embodiment, the method for screening candidate polypeptides results in selected candidate polypeptides that can catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration up to about 50 g/L. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration of about 10 g/L, 15 g/L, 20 g/L, 25 g/L, 30 g/L, 35 g/L, 40 g/L, 45 g/L, or 50 g/L. In another embodiment, the screening method results in selected candidate polypeptides being able to catalyze the conversion of an aldehyde to an alcohol in the presence of a lower alkyl alcohol at a concentration of at least about 20 g/L.

Non-limiting examples of lower alkyl alcohols that can be used in the screening methods of the invention include butanol, isobutanol, propanol, isopropanol, and ethanol. In one embodiment, the lower alkyl alcohol used in the screening method is isobutanol.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present application including the definitions will control. Also, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. All publications, patents and other references mentioned herein are incorporated by reference in their entireties for all purposes.

EXAMPLES

The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various uses and conditions.

General Methods

Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook et al. (Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989, hereinafter referred to as Maniatis) and by Ausubel et al. (Ausubel et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience, 1987).

Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found in Manual of Methods for General Bacteriology (Phillipp et al., eds., American Society for Microbiology, Washington, D.C., 1994) or in Thomas D. Brock (Brock, Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass., 1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were obtained from Sigma-Aldrich Chemicals (St. Louis, Mo.), BD Diagnostic Systems (Sparks, Md.), Invitrogen (Carlsbad, Calif.), HiMedia (Mumbai, India), SD Fine chemicals (India), or Takara Bio Inc. (Shiga, Japan), unless otherwise specified.

The meaning of abbreviations is as follows: “sec” means second(s), “min” means minute(s), “h” means hour(s), “nm” means nanometer(s), “μL” means microliter(s), “mL” means milliliter(s), “mg/mL” means milligram per milliliter, “L” means liter(s), “mM” means millimolar, “M” means molar, “mmol” means millimole(s), “μmole” means micromole(s), “kg” means kilogram(s), “g” means gram(s), “μg” means microgram(s), “ng” means nanogram(s), “PCR” means polymerase chain reaction, “OD” means optical density, “OD600” means the optical density measured at a wavelength of 600 nm, “kDa” means kilodaltons, “g” can also mean the gravitation constant, “bp” means base pair(s), “kbp” means kilobase pair(s), “kb” means kilobase, “%” means percent, “% w/v” means weight/volume percent, “% v/v” means volume/volume percent, “HPLC” means high performance liquid chromatography, “g/L” means gram per liter, “μg/L” means microgram per liter, “ng/μL” means nanogram per microliter, “pmol/μL” means picomole per microliter, “RPM” means revolutions per minute, “μmol/min/mg” means micromole per minute per milligram, “w/v” means weight per volume, and “v/v” means volume per volume.

Example 1 Selection of Potential Isobutyraldehyde Dehydrogenases for Screening

This example describes the basis for the selection of several ADH candidate enzymes for identifying efficient isobutyraldehyde dehydrogenases. Clostridium acetobutylicum Butanol Dehydrogenase A and B (BdhA and BdhB) were chosen for analysis based on literature evidence. Achromobacter xylosoxidans was selected by enriching an environmental sludge sample on medium containing 1-butanol. The organism was then cultured and used to purify a protein fraction that contained butanol dehydrogenase activity, after which the gene corresponding to the Secondary Alcohol Dehydrogenase B (SadB) was cloned as described in U.S. Patent Application Publication No. US 2009-0269823 A1. The horse-liver ADH enzyme (HLADH) is commercially available and was reported to have isobutanol oxidation activity by Green et al. in J. Biol. Chem. 268:7792 (1993).

Desirable properties of an ideal isobutyraldehyde dehydrogenase candidate for the isobutanol production pathway have been described above.

An extensive literature search identified those candidate ADH enzymes with either a high k_(cat) and/or a low K_(M) value for isobutyraldehyde or other closely related aldehydes, or with a lower k_(cat) and/or a higher K_(M) for isobutanol or other closely related alcohols. Protein BLAST searches against the nonredundant protein sequence database (nr) at NCBI were performed using horse liver ADH, Achromobacter xylosoxidans SadB, and Saccharomyces cerevisiae ADH6 as queries, respectively. All the BLAST hits were collected and combined, and sequences with more than 95% sequence identity to each other were removed. A multiple sequence alignment (MSA) was created from the set of remaining 95%-nonredundant sequences, and a phylogenetic tree was generated from the MSA using the neighbor joining method. Similarly, an MSA and a phylogenetic tree were generated separately for a number of selected ADH enzymes to identify closely related homologs of each enzyme, where the alignment consisted of only the BLAST hits obtained using the target enzyme as the query. These enzymes included Achromobacter xylosoxidans SadB, Saccharomyces cerevisiae ADH6, and Saccharomyces cerevisiae ADH7. Based on these analyses, several candidates were selected (Table 3) for evaluation of performance.
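The removal of sequences sharing more than 95% identity can be illustrated with a greedy filter. This is a sketch under stated assumptions, not the procedure actually used, since the text does not name the tool or algorithm: the function names are hypothetical, the identity function is a naive position-by-position comparison standing in for a real alignment-based identity, and the short sequences are invented for illustration.

```python
def naive_identity(a, b):
    """Toy percent identity: ungapped, position-wise over the shorter sequence."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / n


def remove_redundant(seqs, threshold=95.0, identity=naive_identity):
    """Greedy filter: keep a sequence only if its identity to every
    already-kept sequence does not exceed `threshold` percent."""
    kept = {}
    for name, seq in seqs.items():
        if all(identity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return kept


# Hypothetical BLAST hits (20 residues each), for illustration only.
hits = {
    "hit1": "MKALVYHGDHKISLEDKPKP",
    "hit2": "MKALVYHGDHKISLEDKPKA",  # 95% identical to hit1: kept
    "hit3": "MKALVYHGDHKISLEDKPKP",  # 100% identical to hit1: removed
    "hit4": "MSYPEKFEGIAIQSHEDWKN",
}
print(list(remove_redundant(hits)))
```

The result of a greedy filter depends on the order in which hits are visited, so a real pipeline would typically fix that order, for example by sorting hits by length or by BLAST score.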

Example 2 Cloning, Protein Expression and Purification, and Screening for a Suitable Isobutyraldehyde Dehydrogenase

This example describes preparation of ADH-gene constructs for over-expression/purification and measurement of enzyme activities using a time-course assay. Horse-liver ADH (HLADH; A-6128) was purchased from Sigma. Achromobacter xylosoxidans SadB (SadB), Saccharomyces cerevisiae ADH6 (ScADH6) and ADH7 (ScADH7), Entamoeba histolytica ADH1 (EhADH1), Bos taurus Aldehyde Reductase (BtARD), Beijerinckia indica subsp. indica ATCC 9039 ADH (BiADH), Clostridium beijerinckii ADH (CbADH), Rana perezi ADH8 (RpADH8), Rattus norvegicus ADH1 (RnADH1), Thermus sp. ATN1 ADH (TADH), Phenylobacterium zucineum HLK1 ADH (PzADH), Methylocella silvestris BL2 ADH (MsADH), Acinetobacter baumannii AYE ADH (AbADH), Geobacillus sp. WCH70 ADH (GbADH), Vanderwaltozyma polyspora DSM 70294 ADH (VpADH), Mucor circinelloides ADH (McADH), and Rhodococcus erythropolis PR4 ADH (ReADH) were the candidates for which subclones were prepared for protein expression and purification.

Construction of Plasmid Constructs Expressing ADH Candidates

The gene-coding regions of EhADH1, BtARD, CbADH, BiADH, and RpADH8 were synthesized by DNA 2.0 (Menlo Park, Calif.) and those of RnADH1, TADH, PzADH, MsADH, AbADH, GbADH, VpADH, McADH, and ReADH were synthesized by Geneart AG (Germany) after optimizing the codons for expression in Escherichia coli. The amino-acid sequences for these candidates were procured from the Genbank Protein database and provided to DNA 2.0 or Geneart AG for codon optimization. Each coding region was flanked by XhoI and KpnI sites at the 5′ and 3′ ends of the coding sequence, respectively. These constructs were cloned and supplied in either DNA 2.0's vector pJ201 or Geneart's pMA vector.

The plasmids were transformed into chemically competent TOP10 cells (Invitrogen) and amplified by growing the transformants in liquid LB medium containing either 25 μg/mL kanamycin or 100 μg/mL ampicillin. The plasmids, which were purified from overnight cultures (grown at 37° C.), were digested with XhoI (NEB; R0146) and KpnI (NEB; R0142) and ligated into the corresponding sites, in-frame with an N-terminal hexa-histidine tag, in the vector pBADHisA (Invitrogen; V43001) using the DNA ligation kit Version 2.1 from Takara Bio Inc. (6022).

The ligation products were transformed into chemically competent TOP10 cells (Invitrogen; C4040-50). The transformed cells were streaked on a plate containing LB medium plus 100 μg/mL ampicillin. Clones containing the ADH inserts were confirmed by restriction digestion with XhoI/KpnI. Plasmids with the correct insert contained the expected 1.2 kbp band in each case. The cloned sequence was confirmed via DNA sequencing. The resulting clones were named pBADHisA::EhADH1, pBADHisA::BtARD, pBADHisA::CbADH, pBADHisA::BiADH, pBADHisA::RpADH8, pBADHisA::RnADH1, pBADHisA::TADH, pBADHisA::PzADH, pBADHisA::MsADH, pBADHisA::AbADH, pBADHisA::GbADH, pBADHisA::VpADH, pBADHisA::McADH, and pBADHisA::ReADH, respectively.

SadB, an enzyme which was previously examined, was PCR-amplified with KOD polymerase enzyme (Novagen), as per the procedure mentioned in the product manual, from pTrc99a::SadB using primers SadBXhoI-f (CCATGGAATCTCGAGATGAAAGCTCTGGTTTACC, SEQ ID NO: 41) and SadBKpnI-r (GATCCCCGGGTACCGAGCTCGAATTC, SEQ ID NO: 42) to introduce XhoI and KpnI sites at the 5′ and 3′ ends, respectively. After confirmation of the PCR product via agarose-gel electrophoresis, the 1.2-kb PCR product was digested with XhoI and KpnI and cloned into pBADHisA as described above for the other candidate genes. The genes for ScADH6 and ScADH7 were each amplified from 100 ng of genomic DNA of the yeast wild-type strain BY4741 (ATCC 201388) using primers ADH6_XhoI_f (CAAGAAAACTCGAGATCATGTCTTATCCTGAG, SEQ ID NO: 43) and ADH6_KpnI_r (GAGCTTGGTACCCTAGTCTGAAAATTCTTTG, SEQ ID NO: 44) for ScADH6 and ADH7_XhoI_f (CTGAAAAACTCGAGAAAAAAATGCTTTACCC, SEQ ID NO: 45) and ADH7_KpnI_r (GAAAAATATTAGGTACCTAGACTATTTATGG, SEQ ID NO: 46) for ScADH7. The strategy and PCR conditions were identical to those used for the amplification of SadB. The genes were then cloned into the XhoI and KpnI sites of pBADHisA, as per the procedure described above. The plasmids containing SadB, ScADH6 and ScADH7 were labeled as pBADHisA::SadB, pBADHisA::ScADH6 and pBADHisA::ScADH7, respectively.
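A quick sanity check on the primers listed above is to confirm that each carries the intended restriction site. The sketch below uses the standard recognition sequences for XhoI (CTCGAG) and KpnI (GGTACC); because both sites are palindromic, searching the primer as written is sufficient. The dictionary and function names are hypothetical.

```python
# Recognition sequences (both palindromic, so one strand is enough).
SITES = {"XhoI": "CTCGAG", "KpnI": "GGTACC"}

# Primer sequences as given in the text (SEQ ID NOs: 41-46).
PRIMERS = {
    "SadBXhoI-f": "CCATGGAATCTCGAGATGAAAGCTCTGGTTTACC",
    "SadBKpnI-r": "GATCCCCGGGTACCGAGCTCGAATTC",
    "ADH6_XhoI_f": "CAAGAAAACTCGAGATCATGTCTTATCCTGAG",
    "ADH6_KpnI_r": "GAGCTTGGTACCCTAGTCTGAAAATTCTTTG",
    "ADH7_XhoI_f": "CTGAAAAACTCGAGAAAAAAATGCTTTACCC",
    "ADH7_KpnI_r": "GAAAAATATTAGGTACCTAGACTATTTATGG",
}


def sites_in(primer):
    """Return the names of the enzymes whose site occurs in the primer."""
    sequence = primer.upper()
    return [enzyme for enzyme, site in SITES.items() if site in sequence]


for name, sequence in PRIMERS.items():
    print(name, sites_in(sequence))
```

Each forward primer is expected to report only XhoI and each reverse primer only KpnI, consistent with the sites being introduced at the 5′ and 3′ ends, respectively.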

Expression of Recombinant ADHs in E. coli

For the data shown, either BL21-CodonPlus (Invitrogen; 230240) or a proprietary E. coli strain was used for the overexpression of ADH enzymes. However, it is believed that commercially available strains, such as BL21-CodonPlus, are suitable for overexpression of ADH enzymes.

Expression plasmids (pBADHisA plasmids) containing ADH genes were prepared from 3-mL overnight cultures of TOP10 transformants using the Qiaprep spin miniprep kit (Qiagen, Valencia Calif.; 27106) following the manufacturer's instructions. One ng of each plasmid was transformed into either BL21-CodonPlus or proprietary E. coli electro-competent cells using a Bio-Rad Gene Pulser II (Bio-Rad Laboratories Inc, Hercules, Calif.) by following the manufacturer's directions. The transformed cells were spread onto agar plates containing LB medium plus 100 μg/mL of each of ampicillin and spectinomycin. The plates were incubated overnight at 37° C. Colonies from these plates were inoculated into 3.0 mL of LB medium containing 100 μg/mL of each of ampicillin and spectinomycin and grown at 37° C. while shaking at 250 rpm. Cells from these starter cultures (grown overnight) were used to inoculate 1-L media at a dilution of 1:1000. The cells were induced with 0.02% arabinose after the culture reached an OD of ~0.8. The induction was carried out at 37° C. while shaking at 250 rpm overnight. The cells were then harvested by centrifugation at 4000 g for 10 min at 4° C. The cells were lysed by treatment with 40 ml of BugBuster master mix (Novagen; 71456-4), in the presence of Complete, EDTA-free Protease Inhibitor Cocktail tablets (Roche; 11873580001) and 1 mg/ml lysozyme, by placing on a rocker at 4° C. for 30 min. The cell debris was removed by centrifugation at 16,000 g for 20 min at 4° C.

The total protein concentration in samples was measured by the Bradford assay using Bradford dye concentrate (Bio-Rad). The samples and protein standards (Bovine Serum Albumin, BSA) were set up in either individual cuvettes (1-mL reactions) or a 96-well microplate following the manufacturer's protocol. The concentrations of proteins were calculated from absorbance values at 595 nm, measured using either a Cary 100 Bio UV-Visible spectrophotometer (Varian, Inc.) or a SpectraMax plate reader (Molecular Devices Corporation, Sunnyvale, Calif.).
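The calculation of protein concentration from absorbance at 595 nm can be illustrated with a minimal sketch. It assumes a linear BSA standard curve fitted by ordinary least squares; the standard concentrations, absorbance readings, and dilution factor below are hypothetical values chosen for illustration and are not data from these experiments.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_conc(a595, slope, intercept, dilution=1.0):
    """Read a sample off the standard curve and correct for its dilution."""
    return (a595 - intercept) / slope * dilution


# Hypothetical BSA standards (mg/mL) and their A595 readings.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 0.75, 1.0]
bsa_a595 = [0.05, 0.175, 0.30, 0.425, 0.55]

slope, intercept = fit_line(bsa_mg_per_ml, bsa_a595)

# Hypothetical sample measured after a 10-fold dilution.
sample_mg_per_ml = protein_conc(0.40, slope, intercept, dilution=10)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_mg_per_ml:.2f} mg/mL")
```

In practice only readings that fall within the range of the standards should be interpolated this way.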

ADH Enzyme Purification and Activity Assays

Cell-free extracts prepared from 1-litre cultures as per the procedure described above were used directly to purify the various expressed ADH enzymes via immobilized metal affinity chromatography (IMAC) on 5-mL HisTrap FF columns (GE Healthcare Life Sciences; 175255-01). The entire procedure was carried out using an AKTAexplorer 10 S (GE Healthcare Life Sciences; 18-1145-05) FPLC system. The extracts were mixed with 30 mM imidazole and loaded onto the HisTrap columns. Upon loading, the column was washed with 50 mM sodium phosphate buffer, pH 8.0, containing 30 mM imidazole (approximately 10-20 column volumes) to remove unbound and non-specifically bound proteins. The ADH protein was then eluted with a gradient of 30 mM to 500 mM imidazole over 20 column volumes. The peak fractions were electrophoresed on 10% Bis-Tris SDS-PAGE gels (Invitrogen; NP0301) using Invitrogen's XCell SureLock Mini-Gel apparatus (EI0001). Upon Coomassie staining and destaining, it could be ascertained that the fractions were more than 95% pure and contained only the ADH protein. Activity assays were carried out to ensure that the purified proteins were active.

As a routine practice, the crude extracts and purified proteins were assayed for butanol oxidation activity, in order to ensure that the recombinant proteins were active throughout the purification process. In the reductive direction, isobutyraldehyde reduction assays were carried out with NADH or NADPH as the cofactor and an excess of the isobutyraldehyde substrate (40 mM). In each case, enzymatic activity was measured for 1 min at 30° C. in 1-ml reactions by following the decrease or increase in the absorbance at 340 nm using a Cary 100 Bio UV-Visible spectrophotometer (Varian Inc.), depending on whether the NADH/NADPH was being consumed (absorbance decreases) or generated (absorbance increases) in the reaction. Alcohol oxidation assays were carried out in 50 mM sodium phosphate buffer at pH 8.8 and aldehyde reduction reactions were assayed in 100 mM potassium phosphate buffer at pH 7.0. Depending on the nature of the reaction being carried out, the enzyme and cofactor stocks were diluted in the reaction buffers at the respective pHs. Either buffer or cell extract prepared from the proprietary E. coli strain (with no ADH plasmid) was used as the negative control for assays with purified protein and cell-free extracts, respectively.
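As an illustration of how an absorbance slope at 340 nm translates into an activity, the following sketch applies the Beer-Lambert law with the standard molar absorptivity of NAD(P)H at 340 nm (6.22 mM⁻¹ cm⁻¹). The 1-cm path length, the example slope, and the amount of enzyme are assumptions made for this example; the function names are hypothetical.

```python
# Molar absorptivity of NADH/NADPH at 340 nm in mM^-1 cm^-1 (standard literature value).
EPSILON_NADH_340 = 6.22


def rate_umol_per_min(delta_a340_per_min, volume_ml=1.0, path_cm=1.0):
    """Convert an absorbance slope into micromoles of NAD(P)H converted per minute."""
    conc_change_mm_per_min = abs(delta_a340_per_min) / (EPSILON_NADH_340 * path_cm)
    # 1 mM equals 1 umol/mL, so multiplying by the volume in mL gives umol/min.
    return conc_change_mm_per_min * volume_ml


def specific_activity(delta_a340_per_min, enzyme_mg, volume_ml=1.0, path_cm=1.0):
    """Specific activity in umol per min per mg of protein."""
    return rate_umol_per_min(delta_a340_per_min, volume_ml, path_cm) / enzyme_mg


# Hypothetical reading: A340 falls by 0.311 per minute in a 1-ml reaction with 5 ug of enzyme.
example_activity = specific_activity(-0.311, enzyme_mg=0.005)
print(f"{example_activity:.1f} umol/min/mg")
```

The sign of the slope only indicates direction (consumption or generation of the reduced cofactor), so the sketch uses its absolute value.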

In initial experiments, there were insufficient levels of protein expression with EhADH1 and RpADH8. Subsequently, the activity assays failed to detect ADH activity in the cell extracts expressing these enzymes. Likewise, although BtARD initially showed good levels of protein expression and the protein could be purified to homogeneity, it had no detectable activity under the conditions used for the assay. It is believed that one of skill in the art could further optimize expression and assay conditions for these candidates. Sufficient amounts of active protein could be purified with all other enzymes for which data are presented. Cofactor specificities were measured with all these enzymes in isobutyraldehyde reduction reactions (as in the procedure described above), using either NADH or NADPH as the cofactor. In each case, the activity measured with the preferred cofactor was at least 10-fold higher than the activity measured with the other form of the cofactor. Table 6 summarizes the cofactor preferences for some of the ADH enzymes.

TABLE 6

CANDIDATE ADH                            COFACTOR PREFERENCE
Horse-liver ADH                          NADH
Saccharomyces cerevisiae ADH6            NADPH
Saccharomyces cerevisiae ADH7            NADPH
Achromobacter xylosoxidans SadB          NADH
Beijerinckia indica ADH                  NADH
Clostridium beijerinckii ADH             NADPH
Rattus norvegicus ADH1                   NADH
Thermus sp. ATN1 ADH                     NADH
Phenylobacterium zucineum HLK1 ADH       NADH
Methylocella silvestris BL2 ADH          NADH
Acinetobacter baumannii AYE ADH          NADH
Geobacillus sp. WCH70 ADH                NADPH
Mucor circinelloides ADH                 NADH

Screening Purified ADH Candidates Using a Semi-Physiological Time-Course Assay

The ideal way to characterize and compare various ADH candidates would be to calculate and compare the full set of kinetic constants, i.e., k_(cat) values for aldehyde reduction and alcohol oxidation, K_(M) values for isobutyraldehyde, isobutanol, NAD(P) and NAD(P)H, and K_(I) values for isobutyraldehyde and isobutanol. A detailed characterization for numerous candidates would require considerable expenditure of time, effort and money. Thus, a qualitative, semi-physiological assay was developed to allow quick and efficient comparison of several candidates. The assay entails initiating all reactions with a constant amount of each enzyme. In this case, 1 μg of each enzyme was used to initiate reactions that contained isobutyraldehyde and NADH at concentrations of 1 mM and 200 μM, respectively. Each reaction's time course was followed for 10 min by measuring the decrease in absorbance at 340 nm as the reaction proceeded towards equilibrium. An enzyme with a high k_(cat) would drive the reaction towards equilibrium faster than an enzyme with a lower k_(cat). A parallel assay was also carried out under identical conditions, but with the inclusion of 321 mM isobutanol (24 g/L) in the reaction. An enzyme that is relatively uninhibited by this concentration of isobutanol would have a time course that closely mimics the time course in the absence of isobutanol. FIG. 1 compares time courses exhibited by the ADH candidate enzymes in these assays.
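The concentrations quoted for this assay can be related to one another with a short calculation. The sketch below assumes a molar mass of 74.12 g/mol for isobutanol, a molar absorptivity of 6.22 mM⁻¹ cm⁻¹ for NADH at 340 nm, and a 1-cm path length; none of these values is stated in the text, and the function names are hypothetical.

```python
ISOBUTANOL_G_PER_MOL = 74.12   # assumed molar mass of isobutanol (C4H10O)
EPSILON_NADH_340 = 6.22        # mM^-1 cm^-1, standard value for NADH at 340 nm


def mm_to_g_per_l(conc_mm, molar_mass_g_per_mol):
    """Convert a millimolar concentration to grams per litre."""
    return conc_mm * molar_mass_g_per_mol / 1000.0


def expected_a340(nadh_um, path_cm=1.0):
    """Absorbance expected from a given micromolar NADH concentration."""
    return nadh_um / 1000.0 * EPSILON_NADH_340 * path_cm


isobutanol_g_per_l = mm_to_g_per_l(321, ISOBUTANOL_G_PER_MOL)
starting_a340 = expected_a340(200)
print(f"321 mM isobutanol is about {isobutanol_g_per_l:.1f} g/L")
print(f"200 uM NADH gives a starting A340 of about {starting_a340:.2f}")
```

Under these assumptions 321 mM corresponds to roughly 24 g/L, in line with the figure given above, and the time course starts from an absorbance of about 1.24 contributed by NADH.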

Based on the results presented in FIG. 1, it is inferred that the Beijerinckia indica ADH is likely to have the highest k_(cat) for the isobutyraldehyde reduction reaction and ADH6 is likely to be the least inhibited by isobutanol in the reaction.

Example 3 Identification of Beijerinckia indica ADH with a High k_(cat) and a Low K_(M) for Isobutyraldehyde

Kinetic constants of the ADH enzymes were calculated and compared to identify those candidate ADH enzymes with the most desirable properties for the conversion of isobutyraldehyde to isobutanol in the last step of the engineered pathway for isobutanol production. The assays for determining the kinetic constants were carried out using initial rates from the assays described above. Decreases in NADH can be correlated with aldehyde being consumed (Biochemistry by Voet and Voet, John Wiley & Sons, Inc.). However, the amount of a given enzyme used in the reaction was in the range of 0.1 to 5 μg. The concentration of a given enzyme was chosen so that initial velocities could be measured over a 1-min time course. For each enzyme, Michaelis-Menten plots were generated with a broad range of substrate concentrations. Rough estimates of K_(M) were obtained, based on which the assays were redesigned to use substrate concentrations in the range of 0.5 to 10 times the K_(M) value, in order to obtain the appropriate kinetic constants. Isobutyraldehyde (isobutanal) reduction reactions were carried out at 30° C. in 100 mM potassium phosphate buffer, pH 7.0, containing 200 μM NADH. When calculating the K_(I) for isobutanol, the same reactions were carried out in the presence of varying concentrations of isobutanol (generally 0-535 mM) in the reaction (see FIG. 7, for example). Reactions with isobutanol substrate were performed at 30° C. in 50 mM sodium phosphate buffer, pH 8.8, containing 7.5 mM NAD. The Enzyme Kinetics module (Version 1.3) of SigmaPlot 11 (Systat Software, Inc.) was used to fit data to Michaelis-Menten equations and calculate the kinetic constants. Kinetic constants obtained for the indicated ADH enzymes are given in Table 7. The k_(cat)/K_(M) is derived from the individual values of k_(cat) and K_(M) and is not an experimentally determined value. The ratios of the K_(M), K_(I) and k_(cat)/K_(M) for each candidate enzyme as compared to the same parameter for SadB are given in Table 9.
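To make the fitting step concrete, the following sketch estimates V_max and K_(M) from initial rates. It is not the SigmaPlot procedure used for the reported values: it substitutes a Hanes-Woolf linearization ([S]/v plotted against [S]) solved by ordinary least squares, which needs only the standard library. The substrate concentrations and rates are synthetic, noise-free values generated from assumed parameters (V_max = 10, K_(M) = 0.2 mM) purely for illustration.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(s_values, v_values):
    """Estimate (Vmax, Km) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(s_values)
    mean_x = sum(s_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in s_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s_values, ys))
    slope = sxy / sxx                  # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic substrate range spanning 0.5x to 10x an assumed Km of 0.2 mM.
substrate_mm = [0.1, 0.2, 0.4, 0.8, 1.2, 2.0]
rates = [michaelis_menten(s, vmax=10.0, km=0.2) for s in substrate_mm]

vmax_fit, km_fit = hanes_woolf_fit(substrate_mm, rates)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.3f} mM")
```

With real, noisy data a direct nonlinear fit to the Michaelis-Menten equation, as performed by the software named above, is generally preferred because linearized forms distort the error structure.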

TABLE 7

Enzyme    k_(cat)    K_(M) (Isobutanal)   K_(I) (Isobutanol)   k_(cat)/K_(M)   Other enzymatic properties and cofactor preference
          (sec⁻¹)    (mM)                 (mM)
HLADH*    8          0.1                  2                    82              [Isobutanol oxidation: k_(cat) = 5 sec⁻¹; K_(M) = 0.4 mM]
SadB*     109        1                    180                  105             K_(M) (NADH) = 0.02 mM [Isobutanol oxidation: k_(cat) = 2 sec⁻¹; K_(M) = 24 mM]
ScADH6    47         0.6                  1170                 81              NADPH specific
ScADH7*   36         0.3                  88                   120             NADPH specific
BiADH     283        0.2                  36                   1252            K_(M) (NADH) = 0.06 mM [Isobutanol oxidation: k_(cat) = 9 sec⁻¹; K_(M) = 4.7 mM]
CbADH     123        1.5                  ND                   85              NADPH specific
TADH      15         1.3                  ND                   11              NADH specific
RnADH1    ~5         ≦0.003               ND                   ~1667           NADH specific

For those enzymes marked with an asterisk in Table 7, at least 3 assays were performed with separate preparations of the enzyme. All other numbers are values from either one assay or are averages from 2 assays performed with the same enzyme sample.

The data for Beijerinckia indica ADH (BiADH) show the highest k_(cat) and a reasonably high k_(cat)/K_(M), and this enzyme is preferred. The enzyme RnADH1 appears to have a low K_(M) value for isobutyraldehyde and consequently may have a high catalytic efficiency. However, the low K_(M) value precludes an accurate determination of its K_(M) value via spectrophotometric assays. Nevertheless, the enzyme's performance in the isobutanol production host may be limited more by the k_(cat) if the intracellular steady-state levels of isobutyraldehyde are in excess of its K_(M) value. Comparing BiADH with SadB, the former's catalytic efficiency for isobutyraldehyde reduction is ~12 times that of the latter, although it is more sensitive to isobutanol than SadB. With regard to the nucleotide cofactor, SadB has a lower K_(M) value for NADH when compared with BiADH. ScADH6 has a high K_(I) value for isobutanol, indicating that this enzyme is likely to function in vivo, unfettered by the presence of isobutanol at concentrations that are expected in an isobutanol production host. Among the candidates analyzed so far, SadB has the lowest catalytic efficiency for isobutanol oxidation (k_(cat)/K_(M)=0.083), followed by BiADH (1.91) and HLADH (12.5).
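The catalytic efficiencies quoted above follow directly from the Table 7 entries, as the short calculation below illustrates. The dictionary layout and function name are hypothetical; the numbers are the k_(cat) (sec⁻¹) and K_(M) (mM) values for isobutanol oxidation and the k_(cat)/K_(M) values for isobutyraldehyde reduction listed in Table 7.

```python
# Isobutanol oxidation constants from Table 7: (k_cat in sec^-1, K_M in mM).
oxidation = {
    "SadB": (2, 24),
    "BiADH": (9, 4.7),
    "HLADH": (5, 0.4),
}

# k_cat/K_M for isobutyraldehyde reduction, as listed in Table 7.
reduction_efficiency = {"SadB": 105, "BiADH": 1252}


def catalytic_efficiency(kcat, km):
    """k_cat / K_M, derived from the two individual constants."""
    return kcat / km


oxidation_efficiency = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in oxidation.items()
}
biadh_vs_sadb = reduction_efficiency["BiADH"] / reduction_efficiency["SadB"]

for name, value in sorted(oxidation_efficiency.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.3g}")
print(f"BiADH/SadB reduction efficiency ratio: {biadh_vs_sadb:.1f}")
```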

Example 4

Seven additional candidate ADH enzymes were synthesized, expressed, and assayed according to methods such as described in Example 2. Kinetic constants obtained for the indicated ADH enzymes (Phenylobacterium zucineum HLK1 ADH (PzADH), Methylocella silvestris BL2 ADH (MsADH), Acinetobacter baumannii AYE ADH (AbADH), Geobacillus sp. WCH70 ADH (GbADH), and Mucor circinelloides ADH (McADH)) are given in Table 8. A comparison of K_(M), K_(I), and k_(cat)/K_(M) for each candidate enzyme with the same parameter for SadB is given in Table 9 as a percentage of the values determined (Table 7) for SadB. Percentages less than 100 indicate a value less than that determined for SadB; percentages higher than 100 indicate a value greater than that determined for SadB. There was no expression for Rhodococcus erythropolis PR4 ADH (ReADH) and no detectable activity for Vanderwaltozyma polyspora DSM 70294 ADH (VpADH) in these assays.

TABLE 8

Enzyme   k_(cat)    K_(M) (Isobutanal)   K_(I) (Isobutanol)   k_(cat)/K_(M)*   Other enzymatic properties and cofactor preference
         (sec⁻¹)    (mM)                 (mM)
PzADH    30         0.1                  13                   321              NADH specific; no measurable conversion of isobutanol to isobutyraldehyde
MsADH    33         0.06                 19                   530              NADH specific; no measurable conversion of isobutanol to isobutyraldehyde
AbADH    99         10                   305                  10               NADH specific; no measurable conversion of isobutanol to isobutyraldehyde
GbADH    32         0.4                  13                   72               NADPH specific; no measurable conversion of isobutanol to isobutyraldehyde
McADH    151        30                   79                   5                NADH specific; no measurable conversion of isobutanol to isobutyraldehyde

TABLE 9
Indicated parameter as a percentage of the same parameter determined for SadB

Enzyme    k_(cat)   K_(M)    K_(I)   k_(cat)/K_(M)
HLADH     7%        10%      1%      78%
SadB      100%      100%     100%    100%
ScADH6    43%       60%      650%    77%
ScADH7    33%       30%      49%     114%
BiADH     260%      20%      20%     1192%
CbADH     113%      150%     ND      81%
TADH      14%       130%     ND      10%
RnADH1    5%        <1%      ND      1588%
PzADH     28%       10%      7%      243%
MsADH     30%       6%       11%     532%
AbADH     91%       1020%    169%    9%
GbADH     29%       44%      7%      69%
McADH     138%      3000%    44%     5%
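The normalization used for Table 9 can be sketched as follows for a few of the enzymes characterized in Table 7. The data structure and function name are hypothetical, and only the Table 7 values for SadB, HLADH, ScADH6 and BiADH are used; each parameter is simply divided by the corresponding SadB value and expressed as a rounded percentage.

```python
# Table 7 values: (k_cat in sec^-1, K_M in mM, K_I in mM, k_cat/K_M).
table7 = {
    "SadB": (109, 1, 180, 105),
    "HLADH": (8, 0.1, 2, 82),
    "ScADH6": (47, 0.6, 1170, 81),
    "BiADH": (283, 0.2, 36, 1252),
}


def percent_of_sadb(enzyme, data=table7, reference="SadB"):
    """Express each parameter of an enzyme as a rounded percentage of the SadB value."""
    return tuple(
        round(value / ref_value * 100)
        for value, ref_value in zip(data[enzyme], data[reference])
    )


for name in table7:
    kcat, km, ki, efficiency = percent_of_sadb(name)
    print(f"{name}: k_cat {kcat}%, K_M {km}%, K_I {ki}%, k_cat/K_M {efficiency}%")
```

For these four enzymes the rounded percentages agree with the corresponding rows of Table 9.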

Example 5 Construction of S. cerevisiae Strain PNY2211

PNY2211 was constructed in several steps from S. cerevisiae strain PNY1507 as described in U.S. Appl. No. 61/380,563, filed Sep. 7, 2010, and in the following paragraphs. First, the strain was modified to contain a phosphoketolase gene. Construction of phosphoketolase gene cassettes and integration strains was previously described in U.S. Appl. No. 61/356,379, filed Jun. 18, 2010. Next, an acetolactate synthase gene (alsS) was added to the strain, using an integration vector previously described in U.S. Appl. No. 61/308,563. Finally, homologous recombination was used to remove the phosphoketolase gene and integration vector sequences, resulting in a scarless insertion of alsS in the intergenic region between pdc1Δ::ilvD (a previously described deletion/insertion of the PDC1 ORF in U.S. Appl. No. 61/308,563) and the native TRX1 gene of chromosome XII. The resulting genotype of PNY2211 is MATa ura3Δ::loxP his3Δ pdc6Δ pdc1Δ::P[PDC1]-DHAD|ilvD_Sm-PDC1t-P[FBA1]-ALS|alsS_Bs-CYC1t pdc5Δ::P[PDC5]-ADH|sadB_Ax-PDC5t d2Δ::loxP fra2Δ adh1Δ::UAS(PGK1)P[FBA1]-kivD_L1(y)-ADH1t.

A phosphoketolase gene cassette was introduced into PNY1507 by homologous recombination. The integration construct was generated as follows. The plasmid pRS423::CUP1-alsS+FBA-budA (as described in U.S. Publ. No. 2009/0305363 A1) was digested with NotI and XmaI to remove the 1.8 kb FBA-budA sequence, and the vector was religated after treatment with Klenow fragment. Next, the CUP1 promoter was replaced with a TEF1 promoter variant (M4 variant described by Nevoigt et al. Appl. Environ. Microbiol. 72(8): 5266-5273 (2006)) via DNA synthesis and vector construction service from DNA2.0 (Menlo Park, Calif.). The resulting plasmid, pRS423::TEF(M4)-alsS, was cut with StuI and MluI (which removes a 1.6 kb portion containing part of the alsS gene and the CYC1 terminator), combined with the 4 kb PCR product generated from pRS426::GPD-xpk1+ADH-eutD (SEQ ID NO: 81; the plasmid is described in U.S. Appl. No. 61/356,379) with primers N1176 and N1177 (SEQ ID NOs: 47 and 48, respectively) and a 0.8 kb PCR product generated from yeast genomic DNA (ENO1 promoter region) with primers N822 and N1178 (SEQ ID NOs: 49 and 50, respectively) and transformed into S. cerevisiae strain BY4741 (ATCC#201388; gap repair cloning methodology, see Ma and Botstein). Transformants were obtained by plating cells on synthetic complete medium without histidine. Proper assembly of the expected plasmid (pRS423::TEF(M4)-xpk1+ENO1-eutD, SEQ ID No: 51) was confirmed by PCR using primers N821 and N1115 (SEQ ID NOs: 52 and 53, respectively) and by restriction digest (BglI). Two clones were subsequently sequenced. The 3.1 kb TEF(M4)-xpk1 gene was isolated by digestion with SacI and NotI and cloned into the pUC19-URA3::ilvD-TRX1 vector described in U.S. Appl. No. 61/356,379 (Clone A, cut with AflII). Cloning fragments were treated with Klenow fragment to generate blunt ends for ligation. Ligation reactions were transformed into E. coli Stbl3 cells, selecting for ampicillin resistance. Insertion of TEF(M4)-xpk1 was confirmed by PCR using primers N1110 and N1114 (SEQ ID NOs: 54 and 55, respectively). The vector was linearized with AflII and treated with Klenow fragment. The 1.8 kb KpnI-HincII geneticin resistance cassette described in U.S. Appl. No. 61/356,379 was cloned by ligation after Klenow fragment treatment. Ligation reactions were transformed into E. coli Stbl3 cells, selecting for ampicillin resistance. Insertion of the geneticin cassette was confirmed by PCR using primers N160SeqF5 and BK468 (SEQ ID NOs: 56 and 57, respectively). The plasmid sequence is provided as SEQ ID NO: 58 (pUC19-URA3::pdc1::TEF(M4)-xpk1::kan).

The resulting integration cassette (pdc1::TEF(M4)-xpk1::KanMX::TRX1) was isolated (AscI and NaeI digestion generated a 5.3 kb band that was gel purified) and transformed into PNY1507 using the Zymo Research Frozen-EZ Yeast Transformation Kit (Cat. No. T2001). Transformants were selected by plating on YPE plus 50 μg/ml G418. Integration at the expected locus was confirmed by PCR using primers N886 and N1214 (SEQ ID NOs: 59 and 60, respectively). Next, plasmid pRS423::GAL1p-Cre, encoding Cre recombinase, was used to remove the loxP-flanked KanMX cassette (vector and methods described in U.S. Appl. No. 61/308,563). Proper removal of the cassette was confirmed by PCR using primers oBP512 and N160SeqF5 (SEQ ID NOs: 61 and 62, respectively). Finally, the alsS integration plasmid described in U.S. Appl. No. 61/308,563 (pUC19-kan::pdc1::FBA-alsS::TRX1, clone A) was transformed into this strain using the included geneticin selection marker. Two integrants were tested for acetolactate synthase activity by transformation with plasmids pYZ090ΔalsS and pBP915 (plasmids described in U.S. Appl. No. 61/308,563, transformed using Protocol #2 in “Methods in Yeast Genetics” 2005. Amberg, Burke and Strathern) and evaluation of growth and isobutanol production in glucose-containing media (methods for growth and isobutanol measurement are described in U.S. Appl. No. 61/308,563 and U.S. Publ. No. 2007/0092957 A1). One of the two clones was positive and was named PNY2218. An isolate of PNY2218 containing the plasmids pYZ090ΔalsS and pBP915 was designated PNY2209.

PNY2218 was treated with Cre recombinase and the resulting clones were screened for loss of the xpk1 gene and pUC19 integration vector sequences by PCR using primers N886 and N160SeqR5 (SEQ ID NOs: 59 and 56, respectively). This leaves only the alsS gene integrated in the pdc1-TRX1 intergenic region after recombination between the DNA upstream of xpk1 and the homologous DNA introduced during insertion of the integration vector (a “scarless” insertion since vector, marker gene and loxP sequences are lost, see FIG. 9). Although this recombination could have occurred at any point, the vector integration appeared to be stable even without geneticin selection and the recombination event was only observed after introduction of the Cre recombinase. One clone was designated PNY2211.

Example 6 Construction of Saccharomyces cerevisiae Strain PNY1540

The purpose of this example is to describe the construction of Saccharomyces cerevisiae strain PNY1540 from strain PNY2211. Strain PNY2211 was derived from CEN.PK 113-7D (CBS 8340; Centraalbureau voor Schimmelcultures (CBS) Fungal Biodiversity Centre, Netherlands) and is described in Example 5 above. PNY1540 contains a deletion of the sadB gene, from Achromobacter xylosoxidans, which had been integrated at the PDC5 locus in PNY2211. The deletion, which removed the entire coding sequence, was created by homologous recombination with a PCR fragment containing regions of homology upstream and downstream of the target gene and a URA3 gene for selection of transformants. The URA3 gene was removed by homologous recombination to create a scarless deletion.

The scarless deletion procedure was adapted from Akada et al. 2006 Yeast v23 p399. The PCR cassette for the scarless deletion was made by combining four fragments, A-B-U-C, by overlapping PCR. The PCR cassette contained a selectable/counter-selectable marker, URA3 (Fragment U), consisting of the native CEN.PK 113-7D URA3 gene, along with the promoter (250 bp upstream of the URA3 gene) and terminator (150 bp downstream of the URA3 gene). Fragments A and C, each 500 bp long, corresponded to the 500 bp immediately upstream of the target gene (Fragment A) and the 3′ 500 bp of the target gene (Fragment C). Fragments A and C were used for integration of the cassette into the chromosome by homologous recombination. Fragment B (254 bp long) corresponded to the sequence immediately downstream of the target gene and was used for excision of the URA3 marker and Fragment C from the chromosome by homologous recombination, as a direct repeat of the sequence corresponding to Fragment B was created upon integration of the cassette into the chromosome. Using the PCR product ABUC cassette, the URA3 marker was first integrated into and then excised from the chromosome by homologous recombination. The initial integration deleted the gene, excluding the 3′ 500 bp. Upon excision, the 3′ 500 bp region of the gene was also deleted.
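The two recombination steps described above can be followed with a toy model in which the chromosome is a list of labeled segments. This is only a bookkeeping sketch of the A-B-U-C logic, not a simulation of recombination; the segment labels and function names are hypothetical.

```python
def integrate_cassette(chromosome, cassette):
    """Replace the region bounded by the cassette's end homologies with the cassette."""
    start = chromosome.index(cassette[0])   # homology to Fragment A
    end = chromosome.index(cassette[-1])    # homology to Fragment C
    return chromosome[:start] + cassette + chromosome[end + 1:]


def excise_between_repeats(chromosome, repeat):
    """Recombine two direct repeats, leaving a single copy of the repeat."""
    first = chromosome.index(repeat)
    second = chromosome.index(repeat, first + 1)
    return chromosome[:first] + chromosome[second:]


# A: 500 bp upstream of the gene; C: 3' 500 bp of the gene; B: immediately downstream.
native = ["far_upstream", "A", "gene_5prime", "C", "B", "far_downstream"]
cassette = ["A", "B", "U", "C"]  # U is the URA3 marker

integrated = integrate_cassette(native, cassette)
final = excise_between_repeats(integrated, "B")

print(integrated)  # marker present, 3' end of the gene (C) still in place
print(final)       # gene, marker and Fragment C all removed
```

In this model the integration step removes the gene except for its 3′ 500 bp and creates the direct repeat of B, and the excision step removes U and C, which mirrors the sequence of events described above.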

sadB Deletion

The four fragments for the PCR cassette for the scarless sadB deletion were amplified using Phusion High Fidelity PCR Master Mix (New England BioLabs; Ipswich, Mass.) and CEN.PK 113-7D genomic DNA as template for Fragment U and PNY1503 genomic DNA as template for Fragments A, B, and C. Genomic DNA was prepared with a Gentra Puregene Yeast/Bact kit (Qiagen; Valencia, Calif.). sadB Fragment A was amplified with primer oBP540 (SEQ ID NO: 63) and primer oBP835 (SEQ ID NO: 64), containing a 5′ tail with homology to the 5′ end of sadB Fragment B. sadB Fragment B was amplified with primer oBP836 (SEQ ID NO: 65), containing a 5′ tail with homology to the 3′ end of sadB Fragment A, and primer oBP837 (SEQ ID NO: 66), containing a 5′ tail with homology to the 5′ end of sadB Fragment U. sadB Fragment U was amplified with primer oBP838 (SEQ ID NO: 67), containing a 5′ tail with homology to the 3′ end of sadB Fragment B, and primer oBP839 (SEQ ID NO: 68), containing a 5′ tail with homology to the 5′ end of sadB Fragment C. sadB Fragment C was amplified with primer oBP840 (SEQ ID NO: 69), containing a 5′ tail with homology to the 3′ end of sadB Fragment U, and primer oBP841 (SEQ ID NO: 70). PCR products were purified with a PCR Purification kit (Qiagen). sadB Fragment AB was created by overlapping PCR by mixing sadB Fragment A and sadB Fragment B and amplifying with primers oBP540 (SEQ ID NO: 63) and oBP837 (SEQ ID NO: 66). sadB Fragment UC was created by overlapping PCR by mixing sadB Fragment U and sadB Fragment C and amplifying with primers oBP838 (SEQ ID NO: 67) and oBP841 (SEQ ID NO: 70). The resulting PCR products were purified on an agarose gel followed by a Gel Extraction kit (Qiagen). The sadB ABUC cassette was created by overlapping PCR by mixing sadB Fragment AB and sadB Fragment UC and amplifying with primers oBP540 (SEQ ID NO: 63) and oBP841 (SEQ ID NO: 70). The PCR product was purified with a PCR Purification kit (Qiagen).

Competent cells of PNY2211 were made and transformed with the sadB ABUC PCR cassette using a Frozen-EZ Yeast Transformation II kit (Zymo Research; Orange, Calif.). Transformation mixtures were plated on synthetic complete media lacking uracil supplemented with 1% ethanol at 30° C. Transformants with a sadB knockout were identified by PCR with primers Ura3-end (SEQ ID NO: 71) and oBP541 (SEQ ID NO: 72). A correct transformant was grown in YPE (1% ethanol) and plated on synthetic complete medium containing 5-fluoro-orotic acid (0.1%) at 30° C. to select for isolates that lost the URA3 marker. The deletion and marker removal were confirmed by PCR with primers oBP540 (SEQ ID NO: 63) and oBP541 (SEQ ID NO: 72) using genomic DNA prepared with a YeaStar Genomic DNA Kit (Zymo Research). The absence of the sadB gene from the isolate was demonstrated by a negative PCR result using primers specific for the deleted coding sequence of sadB, oBP530 (SEQ ID NO: 73) and oBP531 (SEQ ID NO: 74). A correct isolate was selected as strain PNY1540 (BP1746).

Example 7 Construction of a Yeast Shuttle Vector Carrying a Gene Encoding the B. indica ADH and a Negative Control Vector

The plasmid pLH468 (SEQ ID NO: 75), as described in U.S. Publ. No. 2009/0305363 A1, is an E. coli/yeast shuttle vector that carries 3 chimeric genes encoding enzymes that comprise part of an isobutanol production pathway (dihydroxyacid dehydratase, αKIV decarboxylase and isobutanol dehydrogenase). The existing isobutanol dehydrogenase gene was replaced by the B. indica ADH using gap repair cloning methodology. The B. indica ADH coding region with suitable 5′ and 3′ flanking sequences was first obtained via DNA synthesis (DNA2.0, Menlo Park, Calif.) with yeast codon optimization. The sequence is provided (SEQ ID NO: 76). The vector pLH468 was linearized with Bsu36I and transformed along with the B. indica ADH (released from the supplier's cloning vector with EcoRI and BamHI) into yeast strain BY4741. Transformants were plated on synthetic complete medium without histidine (Teknova Cat. No. C3020). Plasmids were prepared from several transformants using a Zymoprep™ Yeast Plasmid Miniprep kit (Zymo Research Cat. No. D2004). PCR (with primers N1092 and N1093, SEQ ID NOs: 77 and 78) and restriction enzyme digestion (with KpnI) were used to confirm incorporation of BiADH in the intended location. This plasmid is referred to as pLH468::BiADH.

A second vector was constructed that eliminated most of the original isobutanol dehydrogenase gene (hADH) from pLH468. This was done by releasing an 808 bp fragment via digestion with Bsu36I and PacI, filling in the ends of the DNA with Klenow fragment and re-ligating the vector. The ligation reaction was transformed into E. coli Stbl3 cells. Loss of the hADH gene was confirmed by EcoRI digestion of isolated plasmid clones. One successful clone was selected for the experiment described in Example 8, below. The plasmid is referred to as pLH468ΔhADH.

Example 8 Isobutanologen Strains Carrying BiADH Display Better Glucose-Dependent Growth, Higher Glucose Consumption and Higher Isobutanol Titer and Yield than Control Strains

The plasmids pLH468::BiADH and pLH468ΔhADH were each transformed along with a second isobutanol pathway plasmid (pYZ090ΔalsS, U.S. Appl. No. 61/380,563) into PNY1540. Transformations were plated on synthetic complete medium lacking histidine and uracil, containing 1% ethanol as carbon source. Several transformants were patched to fresh plates. After 48 hours, patches (3 of each strain) were used to inoculate synthetic complete medium (minus histidine and uracil) containing 0.3% glucose and 0.3% ethanol as carbon sources. After 24 hours, growth in this medium was similar for all replicates of both strains. Cultures were then sub-cultured into synthetic complete medium (minus histidine and uracil) containing 2% glucose and 0.05% ethanol as carbon sources. Cultures (starting optical density (OD) at 600 nm was 0.2, culture volume was 20 ml in 125 ml tightly-capped flasks) were incubated for 48 hours. Samples were collected for HPLC analysis at the time of subculture and again after 48 hours. The final ODs were also determined. The average 48 h OD for the BiADH strain was 3.3 (±0.1) compared to 2.37 (±0.07) for the no ADH control. Thus inclusion of BiADH increased OD by 39% under these conditions. Similarly, glucose consumption (assessed by HPLC compared to samples collected immediately after sub-culturing) was increased by 69% (81 ± 1 mM vs. 47.9 ± 0.6 mM). Isobutanol titers were 4-fold higher and molar yields (i.e., yield of isobutanol per mole of glucose consumed) were doubled, as shown in Table 10 below. In the no ADH control strain, significant carbon from the isobutanol pathway accumulated as isobutyrate, indicating that aldehyde dehydrogenases were acting upon isobutyraldehyde.

TABLE 10

                          Isobutanol        Isobutyrate       Isobutyraldehyde
                          (mM)              (mM)              (mM)
TITERS
PNY1540/pLH468::BiADH     32.3 (±0.6)       10.9 (±0.3)       ND
PNY1540/pLH468ΔADH        6.2 (±0.2)        18.4 (±0.4)       2.1 (±0.4)
MOLAR YIELDS
PNY1540/pLH468::BiADH     0.401 (±0.006)    0.135 (±0.005)    ND
PNY1540/pLH468ΔADH        0.129 (±0.004)    0.384 (±0.004)    0.044 (±0.008)
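The percentage increases and molar yields reported in this example can be recomputed from the averaged values as a consistency check. The sketch below uses the mean OD, glucose consumption and isobutanol titers given in the text and Table 10; the function names are hypothetical, and because it starts from rounded averages the yields only approximate the tabulated ones.

```python
def percent_increase(new, reference):
    """Relative increase of `new` over `reference`, in percent."""
    return (new - reference) / reference * 100.0


def molar_yield(product_mm, glucose_consumed_mm):
    """Moles of product formed per mole of glucose consumed."""
    return product_mm / glucose_consumed_mm


od_gain = percent_increase(3.3, 2.37)        # 48 h OD, BiADH vs. no ADH control
glucose_gain = percent_increase(81.0, 47.9)  # glucose consumed (mM)

yield_biadh = molar_yield(32.3, 81.0)        # isobutanol (mM) / glucose consumed (mM)
yield_control = molar_yield(6.2, 47.9)

print(f"OD increase: {od_gain:.0f}%")
print(f"Glucose consumption increase: {glucose_gain:.0f}%")
print(f"Isobutanol molar yield, BiADH strain: {yield_biadh:.3f}")
print(f"Isobutanol molar yield, control strain: {yield_control:.3f}")
```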

What is claimed is:
 1. A recombinant microbial host cell comprising: abiosynthetic pathway for production of a lower alkyl alcohol, thebiosynthetic pathway comprising a substrate to product conversioncatalyzed by a polypeptide with alcohol dehydrogenase activity and oneor more of the following characteristics: (a) the K_(M) value forisobutyraldehyde is lower for said polypeptide relative to a controlpolypeptide having the amino acid sequence of SEQ ID NO: 26; (b) theK_(I) value for isobutanol for said polypeptide is higher relative to acontrol polypeptide having the amino acid sequence of SEQ ID NO: 26; and(c) the k_(cat)/K_(M) value isobutyraldehyde for said polypeptide ishigher relative to a control polypeptide having the amino acid sequenceof SEQ ID NO:
 26. 2. The recombinant microbial host cell of claim 1,wherein the biosynthetic pathway for production of a lower alkyl alcoholis a butanol, propanol, isopropanol, or ethanol biosynthetic pathway. 3.The recombinant microbial host cell of claim 1, wherein the polypeptidewith alcohol dehydrogenase activity has at least 95% identity to theamino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 31, 32, 34, 35,36, 37, or
 38. 4. The recombinant microbial host cell of claim 1,wherein the polypeptide with alcohol dehydrogenase activity has theamino acid sequence of SEQ ID NO:
 31. 5. The recombinant host cell ofclaim 1 wherein the polypeptide with alcohol dehydrogenase activity isencoded by a polynucleotide having at least 85% identity to a nucleotidesequence of SEQ ID NO: 1, 2, 3, 4, 5, 6, 11, 12, 14, 15, 16, or
 17. 6.The recombinant microbial host cell of claim 1 wherein the polypeptidewith alcohol dehydrogenase activity preferentially uses NADH as acofactor.
 7. The recombinant microbial host cell of claim 1, whereinsaid polypeptide having alcohol dehydrogenase activity catalyzes theconversion of isobutyraldehyde to isobutanol in the presence ofisobutanol at a concentration of at least about 15 g/L.
 8. Therecombinant microbial host cell of claim 1, wherein the biosyntheticpathway for production of a lower alkyl alcohol is a butanolbiosynthetic pathway.
 9. The recombinant microbial host cell of claim 1wherein the biosynthetic pathway for production of a lower alkyl alcoholis an isobutanol biosynthetic pathway comprising heterologouspolynucleotides encoding polypeptides that catalyze substrate to productconversions for each step of the following steps: (a) pyruvate toacetolactate; (b) acetolactate to 2,3-dihydroxyisovalerate; (c)2,3-dihydroxyisovalerate to α-ketoisovalerate; (d) α-ketoisovalerate toisobutyraldehyde; and (e) isobutyraldehyde to isobutanol; and whereinsaid microbial host cell produces isobutanol.
 10. The recombinantmicrobial host cell of claim 1 wherein the biosynthetic pathway forproduction of a lower alkyl alcohol is an isobutanol biosyntheticpathway comprising heterologous polynucleotides encoding polypeptidesthat catalyze substrate to product conversions for each step of thefollowing steps: (a) pyruvate to acetolactate; (b) acetolactate to2,3-dihydroxyisovalerate; (c) 2,3-dihydroxyisovalerate toα-ketoisovalerate; (d) α-ketoisovalerate to isobutyryl-CoA; (e)isobutyryl-CoA to isobutyraldehyde; and (f) isobutyraldehyde toisobutanol; and wherein said microbial host cell produces isobutanol.11. The recombinant microbial host cell of claim 1 wherein thebiosynthetic pathway for production of a lower alkyl alcohol is anisobutanol biosynthetic pathway comprising heterologous polynucleotidesencoding polypeptides that catalyze substrate to product conversions foreach step of the following steps: (a) pyruvate to acetolactate; (b)acetolactate to 2,3-dihydroxyisovalerate; (c) 2,3-dihydroxyisovalerateto α-ketoisovalerate; (d) α-ketoisovalerate to valine; (e) valine toisobutylamine; (e) isobutylamine to isobutyraldehyde; and (f)isobutyraldehyde to isobutanol; and wherein said microbial host cellproduces isobutanol.
 12. A recombinant microbial host cell comprising a biosynthetic pathway for the production of a lower alkyl alcohol and a heterologous polynucleotide encoding a polypeptide with alcohol dehydrogenase activity having at least 85% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 31, 32, 34, 35, 36, 37, or 38.
 13. The recombinant microbial host cell of claim 12, wherein the biosynthetic pathway for the production of a lower alkyl alcohol is a 2-butanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) pyruvate to alpha-acetolactate; (b) alpha-acetolactate to acetoin; (c) acetoin to 2,3-butanediol; (d) 2,3-butanediol to 2-butanone; and (e) 2-butanone to 2-butanol; and wherein said microbial host cell produces 2-butanol.
 14. The recombinant microbial host cell of claim 12, wherein the biosynthetic pathway for the production of a lower alkyl alcohol is a 1-butanol biosynthetic pathway comprising heterologous polynucleotides encoding polypeptides that catalyze substrate to product conversions for each of the following steps: (a) acetyl-CoA to acetoacetyl-CoA; (b) acetoacetyl-CoA to 3-hydroxybutyryl-CoA; (c) 3-hydroxybutyryl-CoA to crotonyl-CoA; (d) crotonyl-CoA to butyryl-CoA; (e) butyryl-CoA to butyraldehyde; and (f) butyraldehyde to 1-butanol; and wherein said microbial host cell produces 1-butanol.
 15. The recombinant host cell of claim 12 wherein said polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38.
 16. The recombinant host cell of claim 12 wherein said polypeptide having alcohol dehydrogenase activity comprises an amino acid sequence with at least 95% identity to the amino acid sequence of SEQ ID NO: 31.
 17. The recombinant host cell of claim 1 or claim 12 wherein the genus of said host cell is selected from the group consisting of: Saccharomyces, Pichia, Hansenula, Yarrowia, Aspergillus, Kluyveromyces, Pachysolen, Rhodotorula, Zygosaccharomyces, Galactomyces, Schizosaccharomyces, Torulaspora, Debaryomyces, Williopsis, Dekkera, Kloeckera, Metschnikowia, Issatchenkia, and Candida.
 18. A method for producing isobutanol comprising: (a) providing a recombinant microbial host cell comprising an isobutanol biosynthetic pathway, the pathway comprising a heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol wherein the polypeptide has at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby isobutanol is produced.
 19. The method of claim 18 wherein the heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol has at least 95% identity to the amino acid sequence of SEQ ID NO: 31.
 20. The method of claim 18 wherein the heterologous polypeptide which catalyzes the substrate to product conversion of isobutyraldehyde to isobutanol has the amino acid sequence of SEQ ID NO: 31.
 21. A method for producing 2-butanol comprising: (a) providing a recombinant microbial host cell comprising a 2-butanol biosynthetic pathway, the pathway comprising a heterologous polypeptide having at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 2-butanol is produced.
 22. The method of claim 21 wherein the heterologous polypeptide has at least 95% identity to the amino acid sequence of SEQ ID NO: 31.
 23. A method for producing 1-butanol comprising: (a) providing a recombinant microbial host cell comprising a 1-butanol biosynthetic pathway, the pathway comprising a heterologous polypeptide having at least 90% identity to the amino acid sequence of SEQ ID NO: 21, 22, 23, 24, 25, 27, 31, 32, 34, 35, 36, 37, or 38; and (b) contacting the host cell of (a) with a carbon substrate under conditions whereby 1-butanol is produced.
 24. The method of claim 23 wherein the heterologous polypeptide has at least 95% identity to the amino acid sequence of SEQ ID NO: 31.